Clustering supermarkets: the role of experts
|Article code||Year of publication||English article||Persian translation||Word count|
|2992||2006||17-page PDF||Place an order||8410 words|
Publisher : Elsevier - Science Direct
Journal : Journal of Retailing and Consumer Services, Volume 13, Issue 4, July 2006, Pages 231–247
This work is part of a supermarket chain expansion study and is intended to cluster the existing outlets in order to support the evaluation of outlet performance and the location of new outlet sites. To overcome the curse of dimensionality (a large number of attributes for a very small number of existing outlets), experts’ knowledge is considered in the clustering process. Three alternative approaches are compared to this end, the experts being required to: (1) a priori: provide values for perceived dissimilarities between pairs of outlets; (2) a posteriori: evaluate results from alternative regression trees; (3) interactively: help to select base variables and evaluate results from alternative dendrograms. The last approach provided the best results according to the marketing experts.
As in the rest of Europe, the retail sector in Portugal is going through a restructuring phase. Several authors (e.g. Birkin et al., 2002; Dawson, 2000; Seth and Randall, 1999) identify factors such as increasing consumer mobility, increasing electronic commerce, changing household size, concentration of market power, home market saturation, and changes in planning legislation to explain the new trends in retailing. In food retail in particular, after an unprecedented period of hypermarket growth since the late 1970s, both in number and in market share, it is now clear that hypermarket activity has slowed down significantly in favour of small and medium supermarkets (chain outlets, including discount and hard discount chains), which nowadays show greater dynamism. In Portugal, market share data show that since 1996 supermarkets have been the only outlet type to grow simultaneously in number of outlets and in sales volume and, consequently, to increase their market share from 28% to 34% in the A.C. Nielsen universe. In 1997, the supermarkets reached the leadership and consolidated their expansion strategy. According to the most recent data, in 2001, supermarkets’ sales were already well above the sales of hypermarkets: 47% against just 35% of the total sales of outlets selling food products (see Fig. 1). This change in food outlet type is also found in other European countries (Birkin et al., 2002).

Much more stringent legislation and more demanding consumers force the retail groups to invest in smaller outlets, and so in a strategy based on proximity and on the quality of goods and services. This investment has a longer-run return as well as smaller economies of scale, which calls for careful decision-making (McGoldrick, 2000; Salvaneschi, 1996). Because smaller outlets are heterogeneous in aspects such as location, size, and client behaviour, the definition of outlet clusters is essential in outlet performance and site evaluation.
In the next sections (Sections 2, 3 and 4), an empirical classification of the variables used in measuring supermarket performance and the role of expert knowledge in clustering and marketing applications are presented. The data collection phase is described and three approaches for integrating experts’ knowledge in supermarket clustering are explained. In Section 5 these approaches are compared and a cluster profiling is presented. The paper finishes with conclusions and a methodological discussion.
Conclusion (English)
When a large number of variables is available for clustering a small number of observations, the need to integrate experts’ knowledge in the clustering process becomes particularly relevant. In order to cluster a small number of supermarkets with a large number of available attributes, three alternative approaches that integrate experts’ knowledge are presented: the a priori, the a posteriori and the interactive. According to the analysts’ expectations, the a priori approach should integrate the relevant experts’ knowledge concerning the clustering of supermarkets, as it is based on the perceived dissimilarities between the supermarkets. In the a posteriori approach, experts’ knowledge was required in order to select among alternative regression trees. Finally, in the interactive approach, the integration of experts’ knowledge both in the choice of base variables for clustering and in the selection of results was expected to give a larger role to the experts. From the experts’ perspective, some advantages and disadvantages of the three alternative approaches may be pointed out:
• In the a priori approach, the paired comparisons task was found to be very demanding and the results were poor.
• Regression trees, used as a clustering tool in the a posteriori approach, were found to be very attractive. Regression trees promoted communication between the experts and the analysts, as they simultaneously provide clusters and comprehensible descriptions.
• The interactive approach made the clustering process more transparent, leading to the chosen clustering results. It also allowed the identification of outliers. However, the process was considered very costly.
In the a priori approach, sales-related variables were expected to explain the perceived dissimilarities between the supermarkets, since sales turnover is generally accepted as a major evaluation measure for comparing outlet performance.
As this was not the case, some hypotheses may be raised concerning the complexity of the comparative outlet evaluation task. In fact, as already stated, location and supermarket performance evaluation involves large numbers of attributes, which may render measures of perceived dissimilarities between supermarkets insufficient for clustering purposes. In order to better integrate the diversity contained in the concept of supermarket performance, several clustering base variables should be considered for selection, the interactive approach being more appropriate for this purpose. In the a posteriori approach, experts were quite enthusiastic about the use of regression trees, but they did not pick its results as the “best”. In fact, this is a very unstable approach when applied to small data sets, which calls for extremely careful external validation (Bay and Pazzani, 2000). However, considering that this clustering process was widely accepted by users, it should be further researched taking into account two main guidelines:
• The role of experts should be reinforced and should allow for interactive choice of surrogate variables.
• The choice of the appropriate target variable should be carefully conducted. To this end, multiple criteria decision analysis may be considered in order to build a performance measure better adapted to the experts’ outlet evaluation. Alternatively, several trees with different target variables may be grown and the corresponding results combined in a consensus tree (see Lapointe and Cucumel, 2002; Leclerc, 1998) or using voting techniques well known in the machine learning literature (Duda et al., 2001).
The interactive approach yielded the most satisfactory outlet typology. Although very time-consuming, this approach simultaneously invested in a trust-building process. Thus, the analysts concluded that results were easily accepted, as the experts understood the techniques’ strengths and weaknesses better.
This approach minimizes what is known in Decision Support Systems terminology as the “black box effect” (Adelman and Riedel, 1997) and is similar to an expert visual validation methodology such as the three-step method mentioned in Hennig and Christlieb (2002), but tailored to a very high-dimensional data set. In addition, Wang (2001) uses a similar procedure and identifies two supporting arguments. First, it uses the entire data set, in contrast to cross-validation methodologies, so that information is not lost. Second, a satisfactory result can always be obtained, in contrast to dead-end procedures that offer no alternative result if the validation fails. Several clustering base variables were considered for selection in the interactive approach, but only two variables were selected as base cluster variables. Although this may appear to be in conflict with the richness of information that could be considered to characterize the supermarkets, some remarks may be added:
• The two chosen variables are very different in nature, being collected by distinct processes. They are also not correlated or related in any way.
• Several trials were made considering a larger number of base cluster variables, but the experts could not find any improvement in the results.
• It can be argued that the use of more variables than necessary can be misleading, as it can mask the existence of clusters in the data, introducing noise in the results. In fact, several authors (see Gnanadesikan, 2001; Milligan, 1996 or Gordon, 1999) underline the role of feature selection and extraction for clustering and argue against a bias towards including variables that carry no additional information (Duda et al., 2001).
• Additionally, the remaining available data must be used in cluster interpretation and validation, which is an absolutely necessary phase to confirm the correctness of the defined typology and to characterize the groups (which should not be done with the base cluster variables only).
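As an illustration of the kind of clustering on two base variables discussed above, the following is a minimal sketch in plain Python. It is not the authors' procedure: the two variables (`sales_area_m2`, `loyal_share`), their values, the use of standardisation, Euclidean distance and average linkage are all assumptions made for the example. The merge history it returns is the information a dendrogram would show to the experts.

```python
from math import sqrt
from statistics import mean, pstdev


def standardise(column):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma for x in column]


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(points, n_clusters):
    """Agglomerative clustering: merge the two closest clusters until
    n_clusters remain. Returns the clusters (lists of point indices) and
    the merge history (left cluster, right cluster, merge distance)."""
    clusters = [[i] for i in range(len(points))]
    history = []

    def dist(c1, c2):
        # Average pairwise distance between the members of two clusters.
        return mean(euclidean(points[i], points[j]) for i in c1 for j in c2)

    while len(clusters) > n_clusters:
        pairs = [(dist(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        d, a, b = min(pairs)
        history.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters, history


# Hypothetical data for six outlets (both variables are invented for this sketch).
sales_area_m2 = [400, 420, 450, 1200, 1250, 1300]
loyal_share = [0.60, 0.65, 0.62, 0.30, 0.28, 0.35]

# Standardise so that neither variable dominates the distance.
points = list(zip(standardise(sales_area_m2), standardise(loyal_share)))
clusters, history = average_linkage(points, n_clusters=2)
```

On this toy data the three small outlets and the three large outlets end up in separate clusters. Changing `n_clusters`, or inspecting `history`, corresponds loosely to the experts comparing alternative cuts of a dendrogram.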
In the present application, the small number of observations and the “curse of dimensionality” increased the relevance of integrating experts’ knowledge in the clustering process. According to this study, experts’ knowledge integration should be considered in all stages of the clustering process, mainly in the selection of base variables and also in the selection among alternative clustering results. The supermarket typology that was obtained is already being used for differentiating marketing actions. Thus, the frequent gap between theory and practice was overcome and the last stage of the clustering validation process, actually using it, was reinforced.
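The remark in the conclusions that regression trees simultaneously provide clusters and comprehensible descriptions can be made concrete with a small sketch. The code below is a simplified, CART-style tree written for illustration only; it is not the tree algorithm used in the study. The outlet records, the feature names (`area_m2`, `parking`) and the target (`sales_per_m2`) are hypothetical. Each leaf is returned as a cluster of outlets together with the rule that describes it.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, target, features, min_leaf):
    """Return the (feature, threshold) that most reduces the summed SSE of
    the target in the two children, or None if no admissible split helps."""
    best, best_score = None, sse([r[target] for r in rows])
    for f in features:
        levels = sorted({r[f] for r in rows})
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2
            left = [r[target] for r in rows if r[f] <= t]
            right = [r[target] for r in rows if r[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse(left) + sse(right)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def grow(rows, target, features, max_depth=1, min_leaf=2, rule=()):
    """Grow the tree recursively; return the leaves as (rule, rows) pairs."""
    split = best_split(rows, target, features, min_leaf) if max_depth > 0 else None
    if split is None:
        return [(rule, rows)]
    f, t = split
    left = [r for r in rows if r[f] <= t]
    right = [r for r in rows if r[f] > t]
    return (grow(left, target, features, max_depth - 1, min_leaf, rule + (f"{f} <= {t}",))
            + grow(right, target, features, max_depth - 1, min_leaf, rule + (f"{f} > {t}",)))


# Hypothetical outlet records (all names and values are invented for this sketch).
outlets = [
    {"id": "A", "area_m2": 400, "parking": 0, "sales_per_m2": 9.0},
    {"id": "B", "area_m2": 450, "parking": 0, "sales_per_m2": 8.5},
    {"id": "C", "area_m2": 500, "parking": 1, "sales_per_m2": 9.2},
    {"id": "D", "area_m2": 1100, "parking": 1, "sales_per_m2": 5.0},
    {"id": "E", "area_m2": 1200, "parking": 1, "sales_per_m2": 5.4},
    {"id": "F", "area_m2": 1300, "parking": 0, "sales_per_m2": 4.8},
]

leaves = grow(outlets, target="sales_per_m2", features=["area_m2", "parking"])
for rule, members in leaves:
    print(" and ".join(rule), "->", [r["id"] for r in members])
```

Growing several such trees with different target variables or depths, and letting the experts choose among them, mirrors the a posteriori approach only loosely. With so few observations the chosen split can change when a single outlet is added or removed, which is consistent with the instability noted in the conclusions.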