Differential benchmarking of promotional efficiency with machine learning modelling (II): Practical applications
Publisher: Elsevier - Science Direct
Journal : Expert Systems with Applications, Volume 39, Issue 17, 1 December 2012, Pages 12784–12798
The assessment of promotional sales with models constructed by machine learning techniques is attracting interest because, among other reasons, the current economic situation is leading to a more complex environment of simultaneous and concurrent promotional activities. An operative model diagnosis procedure was previously proposed in the companion paper; it can be readily used both for agile decision making on the architecture and implementation details of the machine learning algorithms and for differential benchmarking among models. In this paper, a detailed example of model analysis is presented for two representative databases with different promotional behaviour, namely a non-seasonal category (milk) and a heavily seasonal category (beer). The performance of four well-known machine learning techniques of increasing complexity is analyzed in detail: k-Nearest Neighbours, General Regression Neural Networks, Multilayer Perceptron (MLP), and Support Vector Machines (SVM) are differentially compared. The present paper evaluates these techniques through the experiments described for both categories, applying the methodological findings obtained in the companion paper. We conclude that some elements included in the architecture are not essential for good performance of the machine learning promotional models, such as the semiparametric nature of the kernel in SVM models, whereas others can be strongly dependent on the database, such as the convenience of multiple-output models in MLP regression schemes. Additionally, the specific behaviour of certain categories and product ranges determines the need to establish suitable and specific procedures for better prediction and feature extraction.
Nowadays, grocery retailers have to make their decisions in a context of high complexity and uncertainty. These circumstances are especially difficult at this particular moment given the current economic situation, which undoubtedly influences decision processes on price positioning and sales promotion strategies. Retailers must therefore be very careful and should evaluate different aspects of the particular context of prices and sales promotions, such as the specific features of the categories and brands they carry (Voss & Seiders, 2003). From this perspective, and although research in this area has increased considerably in recent years (Blattberg & Neslin, 1990; Levy et al., 2004; Martínez-Ruiz et al., 2008; Voss & Seiders, 2003), some questions remain unsolved. For example, some studies show how difficult it is to relate sales promotions and price discounts properly. In particular, authors have identified that, under certain circumstances, it is especially difficult to establish optimal pricing. This is particularly the case when simultaneous effects are taking place and hence putting pressure on margins and profits (Tellis & Zufryden, 1995). Additionally, promotional activities may have discordant objectives among different elements within the sales chain, and this may eventually have a direct impact on decision making strategy (Sayman & Raju, 2004). For instance, manufacturers may be trying to raise sales or to enhance brand recognition, whilst retailers focus on maximising the efficiency and total benefits of a whole category rather than of a single product or brand. For decades, important efforts have been made to better understand the dynamics of sales promotion. Initially, these analyses were based on classical statistical methods; now, increasingly, they draw on important developments in machine learning and data mining techniques.
Machine learning techniques aim to find repetitive patterns, trends or rules that can explain data behaviour in a given context, making it possible to extract new knowledge on consumer behaviour, to improve the performance of marketing operations, and to estimate the deal effect curve (DEC). Though a vast amount of knowledge has been obtained in this setting from machine learning techniques, there are still promotional behaviours that have not been studied in sufficient detail. Hence, a deeper analysis of sales promotion characterization, based on empirical methods, needs to be addressed (Bell et al., 1999; Blattberg et al., 1995; Leeflang & Wittink, 2000). There are a number of operational issues that need to be considered when machine learning techniques are applied to promotional modelling (Liu, Kong, & Yang, 2004; Martínez-Ruiz, Mollá-Descals, & Rojo-Álvarez, 2006; Martínez-Ruiz, Mollá-Descals, Gómez-Borja, & Rojo-Álvarez, 2006a; Wang et al., 2008; Van Heerde et al., 2001): (1) heavy tails and heteroscedasticity in the prediction residuals mean that Gaussianity of the residuals cannot always be assumed; (2) the actual risk in merit figures has to be properly taken into account throughout the whole machine design process; (3) it is not always easy to set a cut-off test for results evaluation. For these reasons, the companion paper (Soguero-Ruiz, Gimeno-Blanes, Mora-Jiménez, Martínez-Ruiz, & Rojo-Álvarez, 2012) presented a simple nonparametric statistical tool, based on paired bootstrap resampling, for establishing clear statistical comparisons among methods. Additionally, the companion paper analysed the assumptions, the method, and the steps that should be applied to properly use the learning-from-samples technique for promotional characterization. That study allowed us to analyze and evaluate different models in terms of averaged and scatter characterizations of merit figures for the distribution of the actual risk.
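The paired bootstrap comparison mentioned above can be illustrated with a minimal sketch. It is not the companion paper's implementation: it assumes a percentile confidence interval on the difference of a merit figure (here MAE) between two models evaluated on the same resampled observations, and it treats the difference as significant when that interval excludes zero. The function names, the number of resamples, and the toy data are all hypothetical.

```python
import random


def mae(y_true, y_pred):
    """Mean Absolute Error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def paired_bootstrap_diff(y_true, pred_a, pred_b, metric=mae,
                          n_resamples=2000, alpha=0.05, seed=0):
    """Percentile CI for metric(A) - metric(B) under paired resampling.

    Assumption: the same resampled indices are used for both models,
    so each bootstrap replicate compares them on identical observations.
    """
    rng = random.Random(seed)
    n = len(y_true)
    diffs = []
    for _ in range(n_resamples):
        # Draw n observation indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        y = [y_true[i] for i in idx]
        a = [pred_a[i] for i in idx]
        b = [pred_b[i] for i in idx]
        diffs.append(metric(y, a) - metric(y, b))
    diffs.sort()
    lo = diffs[int(alpha / 2 * n_resamples)]
    hi = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    # The difference is called significant when the CI excludes zero.
    return lo, hi, not (lo <= 0.0 <= hi)


# Toy data (made up): model A is off by 0.1 units, model B by 1.0 unit.
y_true = [float(i % 10) for i in range(60)]
pred_a = [y + 0.1 for y in y_true]
pred_b = [y + 1.0 for y in y_true]
lo, hi, significant = paired_bootstrap_diff(y_true, pred_a, pred_b)
```

Because both models are scored on the same resampled observations, variability shared by the two models cancels in the difference, which is what makes the comparison paired. Any other merit figure can be passed through the `metric` argument.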
As a positive result in (Soguero-Ruiz et al., 2012), it should be mentioned that the free parameter tuning procedure was largely independent of a considerable number of performance measurements (index of agreement, D; Mean Absolute Error, MAE; or Relative Mean Absolute Error, RMAE), which demonstrated the power of the method for making statistical comparisons in this setting. This paper presents a set of practical applications of the proposed methods and a systematic benchmarking of several relevant and well-known learning techniques. Results are shown for two representative databases with different promotional behaviour, namely a non-seasonal stable category (milk) and a heavily seasonal category (beer). Four well-known machine learning algorithms of increasing complexity are benchmarked differentially, specifically k-Nearest Neighbours (k-NN), General Regression Neural Networks (GRNN), Multilayer Perceptron (MLP), and Support Vector Machines (SVM). Subsequent experiments explore the actual performance improvement obtained from machine design architecture choices in MLP and SVM, and a procedure is stated for feature selection analysis. The paper is organised as follows. Section 2 presents a description of the machine learning techniques analyzed in this work and a short summary of the method proposed in the companion paper (Soguero-Ruiz et al., 2012). The two databases used for sales promotion modelling (milk and beer categories) are then described in Section 3. Section 4 includes the results of the four experiments (A, B, C and D) and the analysis performed on both databases. Experiment A is devoted to a detailed comparison of the performance of the different techniques on each data set. Experiment B shows the performance of different elements in the MLP architecture design, whereas Experiment C deals with the kernel architecture in SVM estimation modelling design.
Last, Experiment D gives a principled approach to feature selection using the paired bootstrap test on the merit figures. Finally, Section 5 presents the discussion and summarizes the concluding remarks.
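As a rough illustration of the two simplest techniques in the benchmark, the sketch below implements k-NN regression and a GRNN in plain Python. It relies on the standard formulation of a GRNN as a Gaussian-kernel weighted average of the training targets; the Euclidean distance, the function names, the `sigma` and `k` values, and the discount-versus-sales toy data are illustrative assumptions rather than details taken from the paper.

```python
import math


def knn_predict(X_train, y_train, x, k=3):
    """k-NN regression: average the targets of the k closest training points."""
    by_distance = sorted(
        (math.dist(xi, x), yi) for xi, yi in zip(X_train, y_train)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)


def grnn_predict(X_train, y_train, x, sigma=0.5):
    """GRNN: Gaussian-kernel weighted average of all training targets."""
    weights = [
        math.exp(-math.dist(xi, x) ** 2 / (2 * sigma ** 2)) for xi in X_train
    ]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed (query far from the data, tiny sigma):
        # fall back to the single nearest neighbour.
        return knn_predict(X_train, y_train, x, k=1)
    return sum(w * y for w, y in zip(weights, y_train)) / total


# Toy data (made up): price discount in percent -> weekly unit sales.
X_train = [(0.0,), (10.0,), (20.0,), (30.0,)]
y_train = [100.0, 130.0, 180.0, 260.0]

sales_knn = knn_predict(X_train, y_train, (12.0,), k=2)
sales_grnn = grnn_predict(X_train, y_train, (12.0,), sigma=5.0)
```

Both estimators are local averages of observed sales. In k-NN the free parameter is the neighbourhood size `k`, and in GRNN it is the kernel width `sigma`, so each has a single free parameter to tune before any comparison between them.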
Conclusion
In this study, we can distinguish two separate types of conclusions: first, those related to the evaluation of the different methods and, second, those related to the promotion of a specific product. From a marketing point of view, it has become evident that it is essential not only to better understand consumer behaviour in terms of response to price deals, but also to study the adequate methodologies in depth. This is especially relevant in the current situation, where the economic crisis has brought noticeable reductions in consumers' budgets, which, together with intensified competition in the retail market, have multiplied the final effect on retailers' revenues and margins. These effects are supported by the changes in consumer behaviour in terms of type of products, volumes, and migration to retailers' private brands. For this reason, it is not surprising that the initial efforts made some decades ago to reflect more accurately how consumers respond to price deals and promotions are now evolving towards more dynamic techniques. Thus, the tendency is to move to nonparametric and semiparametric regression methods, which allow more flexibility and a higher ability to adapt to the specific promotional features. This is really important in the scenario of the present study, where two databases corresponding to different food product categories with specific characteristics are analysed. Milk is a product used daily, while beer has a high level of seasonality. In addition, milk is consumed by a wider range of consumer segments, whereas beer is consumed by adults. The main contribution of this paper, together with the companion paper, is the proposal of an operative method to evaluate promotional efficiency, based on machine learning, as a valid way to analyse the multiple and simultaneous effects coexisting in promotional activities in retail markets when using real data.
The use of machine learning tools is justified by the fact that real data are subject to a large number of factors (namely, consumer idiosyncratic behaviour, multiple simultaneous promotions in the same category, promotions in complementary or substitute products and categories, and even consumer share of wallet), and hence simple models would not be able to trace all concurrent events or even extract complex multivariable characteristics. As far as method evaluation is concerned, initial results showed that it is very often complicated to identify significant differences in model quality among the four machine learning techniques presented (k-NN, GRNN, MLP and SVM). As a consequence, we proposed an operative statistical method based on bootstrap resampling. Final results showed that SVM performed significantly better, followed by k-NN and GRNN, for the milk category. For the beer category, results were also better in general terms for SVM, although in some cases a better result was obtained using k-NN. A second finding was that no significant difference was obtained when comparing the performance of ν-SVM with a Gaussian kernel against a semiparametric kernel for weekly modelling. In terms of the MLP architecture, no general conclusion could be stated from the multiple, model-related results obtained, although particular behaviours were identified for different categories and different brands. The first difference was that the multiple-output MLP architecture performed best for the milk database (a dairy products category). This option provided clearly better behaviour for certain products (premium products), and did not offer a worse response for the rest of the products (non-premium).
For the beer database (a non-daily product category), the single-output MLP was the best choice, because this methodology provided better behaviour for certain products and did not offer a worse response for any other model within the category. Similar results were also found in the analysis of the feature selection procedure for the milk database, which examined the convenience of including certain additional informative variables, such as dichotomous variables (Base Line or direct discount), to evaluate whether these variables added more knowledge to the models. In this regard, we conclude that no inclusion of Base Line in any of the models worsens the performance, especially in the case of premium products. As a general conclusion, we can state that bootstrap resampling allows the performance of the different methods discussed here to be benchmarked statistically. However, the diagnosis and selection of variables are influenced by the characteristics of the promotional method being analyzed, as well as by the specific nature of the product. Overall conclusions therefore cannot be drawn for both products and categories simultaneously, and it is necessary not to rely on one single method when building promotional estimates from real data, although the results showed sufficient relation to categories and groups of brands to infer a certain correlation. This work represents a starting point for future research oriented towards improving the proposed method or extending the results and methods. In particular, the studies developed for milk and beer could be extended to a wider number of categories, to determine whether a higher grouping scheme would eventually allow some of the proposed conclusions to be validated on a wider scope. New categories, such as perfumery, perishable products, and others, could also be analysed.
Finally, new approaches could also be explored by introducing a priori information, such as environmental or idiosyncratic variables. Additionally, the performance of these techniques could be explored on other processes along the value chain, such as manufacturers' promotional activities towards retailers and on to end customers.