Direct marketing decision support through predictive customer response modeling
|Article code||Publication year||English article||Persian translation||Word count|
|23580||2012||9-page PDF||Available to order||7440 words|
Publisher : Elsevier - Science Direct
Journal : Decision Support Systems, Volume 54, Issue 1, December 2012, Pages 443–451
Decision support techniques and models for marketing decisions are critical to retail success. Among different marketing domains, customer segmentation or profiling is recognized as an important area in research and industry practice. Various data mining techniques can be useful for efficient customer segmentation and targeted marketing. One such technique is the RFM (recency, frequency, and monetary) method, which provides a simple means to categorize retail customers. We examine two sets of data, one involving catalog sales and one involving donor contributions. Variants of RFM-based predictive models are constructed and compared to the classical data mining techniques of logistic regression, decision trees, and neural networks. The spectrum of tradeoffs is analyzed. RFM methods are simpler, but less accurate. The effects of balancing cells and of using a value function are examined, and classical data mining algorithms (decision tree, logistic regression, neural networks) are also applied to the data. Both balancing expected cell densities and compressing RFM variables into a value function were found to provide models similar in accuracy to the basic RFM model, with slight improvement obtained by increasing the cutoff rate for classification. Classical data mining algorithms were found to yield better prediction, as expected, in terms of both prediction accuracy and cumulative gains. Relative tradeoffs among these data mining algorithms in the context of customer segmentation are presented. Finally, we discuss practical implications based on the empirical results.
The role of decision support techniques and models for marketing decisions has been important since the inception of decision support systems (DSSs). Diverse techniques and models (e.g., optimization, knowledge-based systems, simulation) have emerged over the last five decades. Many marketing domains, including pricing, new product development, and advertising, have benefited from these techniques and models. Among these marketing domains, customer segmentation or profiling is recognized as an important area. There are at least two reasons for this. First, the marketing paradigm is becoming customer-centric, which makes targeted marketing and service a natural fit. Second, unsolicited marketing is costly and ineffective (e.g., low response rates). Along with these reasons, there are increasing efforts to collect and analyze customer data for better marketing decisions. The advancement of online shopping technologies and database systems has accelerated this trend. Data mining has been a valuable tool in this regard. Various data mining techniques, including statistical analysis and machine learning algorithms, can be useful for efficient customer segmentation and targeted marketing. One such technique is RFM, which stands for recency, frequency, and monetary. RFM analysis has been used for marketing decisions for a long time and is recognized as a useful data mining technique for customer segmentation and response models. A survey also shows that RFM is among the most popular segmentation and predictive modeling techniques used by marketers. RFM relies on three customer behavioral variables (how long since the customer's last purchase, how often the customer purchases, and how much the customer has bought) to find valuable customers or donors and develop future direct marketing campaigns.
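As a minimal sketch of how the three behavioral variables can be turned into customer groups, the Python snippet below scores each variable from 1 to 5 by rank, so that every customer lands in one of 5 × 5 × 5 = 125 cells. The rank-based quintile split, the function names, and the tuple layout of the customer records are assumptions made for illustration; they are not taken from the paper.

```python
from datetime import date


def quintile_scores(values, higher_is_better=True):
    """Assign each value a score from 1 (worst fifth) to 5 (best fifth) by rank."""
    n = len(values)
    # Order the indices so the worst value comes first and the best comes last.
    order = sorted(range(n), key=lambda i: values[i], reverse=not higher_is_better)
    scores = [0] * n
    for rank, i in enumerate(order):
        scores[i] = rank * 5 // n + 1
    return scores


def rfm_cells(customers, today):
    """Map each customer id to an (R, F, M) score triple.

    customers: list of (id, last_purchase_date, purchase_count, total_spent)
    tuples; this layout is hypothetical.
    """
    recency_days = [(today - c[1]).days for c in customers]
    r = quintile_scores(recency_days, higher_is_better=False)  # fewer days is better
    f = quintile_scores([c[2] for c in customers])
    m = quintile_scores([c[3] for c in customers])
    return {c[0]: (r[i], f[i], m[i]) for i, c in enumerate(customers)}
```

A customer scored (5, 5, 5) bought recently, buys often, and spends the most; a marketer would typically rank cells by their observed response rate before choosing which ones to target.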
Having a reliable and accurate customer response model is critical for marketing success, since an increase or decrease in accuracy of 1% could have a significant impact on profits. While there could be many other customer-related factors [e.g., 42], previous studies have shown that RFM alone can offer a powerful way of predicting future customer purchases. Our research builds customer response models using RFM variables and compares them in terms of customer gains and prediction accuracy. The paper aims to increase understanding of how to find knowledge hidden in customer and transactional databases using data mining techniques. This area is called knowledge-based marketing. The next section briefly reviews various data mining techniques for building customer response or predictive models. Section 3 describes the methodology. All the response models are built upon the three RFM variables, while different data mining techniques are used. Then, we present a research design, including two direct marketing data sets with over 100,000 observations, a process for building predictive models, and methods to measure model performance. Section 4 presents the analysis and results. Several methods could increase the prediction performance of an RFM-based predictive model, and more sophisticated data mining techniques (decision tree, logistic regression, and neural networks) appear to outperform the more traditional RFM. These findings are further discussed in Section 5, which compares the results with previous studies of customer response models and places them in the broader context of knowledge-based marketing. We also discuss practical implications of the findings and offer conclusions. The contribution of this study is to demonstrate how RFM model variants can work and to support the general conclusion, consistently reported by others, that RFM models are inferior to traditional data mining models.
This study shows that RFM variables are very useful inputs for designing various customer response models with different strengths and weaknesses, and that the models relying on classical data mining (or predictive modeling) techniques can significantly improve prediction capability in direct marketing decisions. These predictive models using RFM variables are simpler and easier to use in practice than those with a complex set of variables. Thus, in addition to the descriptive modeling techniques popular in practice, marketers should adopt these advanced predictive models in their direct marketing decisions.
Conclusion
Marketing professionals have found RFM to be quite useful, primarily because the data is usually at hand and the technique is relatively easy to use. However, previous research suggests that it is easy to obtain a stronger predictive customer response model with other data mining algorithms [e.g., 1, 19, 20, 24]. RFM has consistently been reported to be less accurate than other forms of data mining models, but that is to be expected, as the original RFM model segments customers/donors into 125 cells and is prescriptive rather than predictive. That expected result was confirmed in this research. RFM helped nicely structure millions of records in each dataset into 125 groups of customers using only three variables. The model offers a well-organized description of people based on their past behaviors, which helps marketers effectively identify valuable customers or donors and develop a marketing strategy. However, this descriptive approach is less accurate in predicting future behavior than more complex data mining models. Improvements to RFM have been proposed. Among the models seeking to improve RFM, our study showed that increasing the cutoff limit leads to improvement in prediction accuracy. However, RFM models at any cutoff limit have trouble competing with degenerate models. Degenerate models have high predictive accuracy for highly skewed datasets, but provide no benefit, as they simply conclude it is not worth promoting to any customer profile. Balancing cell sizes by adjusting the limits for the three RFM variables is sound statistically, but did not lead to improved accuracy in our tests. In both Study 1 and Study 2, the basic RFM model significantly underperformed the other predictive models, except the V function model in Study 1. These results indicate that balancing cells might help improve fit, but involves significant data manipulation for very little predictive improvement in the data sets we examined.
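The interplay between the cutoff limit and the degenerate model can be illustrated with a small sketch. The data below are invented for illustration (100 customers, 5 responders, two hypothetical cells labeled "hi" and "lo") and are not from either study; the function names are likewise assumptions. A customer is predicted to respond when the response rate of the customer's cell reaches the cutoff, while the degenerate model predicts the majority class for everyone.

```python
from collections import defaultdict


def accuracy(actual, predicted):
    """Fraction of customers whose response was classified correctly."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def degenerate_predictions(actual):
    """Predict the majority class for every customer."""
    majority = 1 if sum(actual) * 2 > len(actual) else 0
    return [majority] * len(actual)


def cell_response_rates(cells, actual):
    """Observed response rate of each cell."""
    totals, hits = defaultdict(int), defaultdict(int)
    for c, a in zip(cells, actual):
        totals[c] += 1
        hits[c] += a
    return {c: hits[c] / totals[c] for c in totals}


def cutoff_predictions(rates, cells, cutoff):
    """Predict a response (1) when the customer's cell rate reaches the cutoff."""
    return [1 if rates[c] >= cutoff else 0 for c in cells]


# Hypothetical skewed data: 10 customers in a "hi" cell (4 responders),
# 90 customers in a "lo" cell (1 responder).
cells = ["hi"] * 10 + ["lo"] * 90
actual = [1] * 4 + [0] * 6 + [1] * 1 + [0] * 89
rates = cell_response_rates(cells, actual)

print(accuracy(actual, degenerate_predictions(actual)))          # 0.95
print(accuracy(actual, cutoff_predictions(rates, cells, 0.3)))   # 0.93
print(accuracy(actual, cutoff_predictions(rates, cells, 0.5)))   # 0.95
```

In this toy case, raising the cutoff from 0.3 to 0.5 improves accuracy only by turning the cell model into the degenerate one, which targets nobody; accuracy alone therefore says little about the value of a model on skewed data.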
Using the V ratio is an improvement to RFM that is useful in theory, but in our tests the results are mixed. In Study 1, the technique did not provide better predictive accuracy. In Study 2, it did yield an improved classification rate but underperformed the degenerate model. Thus, this technique deserves further inquiry. Overall, the results above indicate that some suggested alternatives to the traditional RFM have limitations in prediction. The primary conclusion of our study, as was expected, is that classical data mining algorithms outperformed RFM models in terms of both prediction accuracy and cumulative gains. This is primarily because decision tree, logistic regression, and neural networks are often considered the benchmark “predictive” modeling techniques. Predictive modeling or analytics is in high demand in many industries, including direct marketing. This implies that marketers can make more effective marketing decisions by embracing advanced predictive modeling techniques in addition to popular descriptive models. It is often the case that decision tree, logistic regression, and neural networks vary in their ability to fit specific sets of data. Furthermore, there are many parameters that can be used with neural network models and decision trees. All three of these model types have the advantage of being able to consider external variables in addition to R, F, and M. Here, we applied them to these three variables without adding other explanatory variables. All three model types did better than the degenerate case and than any of the other variants we applied. The best overall predictive fit was obtained using the decision tree model. This model also outperformed the other predictive models in cumulative gains in both studies. Decision trees tend to have advantages on low-dimensionality datasets like those used in this research, a characteristic that may explain this result.
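Cumulative gains, the second comparison measure used here, can be computed as the share of all responders captured when only the top-scored fraction of customers is contacted. The snippet below is a minimal sketch under that standard definition; the function name, the default fractions, and the assumption that each model outputs a numeric score per customer are illustrative choices, not details from the paper.

```python
def cumulative_gains(scores, actual, fractions=(0.1, 0.2, 0.5, 1.0)):
    """Share of all responders found in the top-scored fraction of customers.

    scores: model scores, higher meaning more likely to respond.
    actual: 1 for a responder, 0 otherwise.
    """
    # Rank customers from the highest score to the lowest.
    ranked = [a for _, a in sorted(zip(scores, actual), key=lambda t: t[0], reverse=True)]
    total = sum(ranked)
    if total == 0:
        raise ValueError("cumulative gains are undefined without any responders")
    gains = {}
    for frac in fractions:
        k = round(frac * len(ranked))  # number of customers contacted
        gains[frac] = sum(ranked[:k]) / total
    return gains
```

A model that ranks customers no better than chance captures roughly 20% of responders in the top 20% of the list; the further a model's gains rise above that diagonal, the more useful it is for deciding whom to contact.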
Thus, we do not contend that decision trees will always be best. However, a major relative advantage of decision trees is that they provide an easily understandable model. For example, Table 10 presents the decision tree rule sets obtained in Study 1, which amount to enumerating ranges of R that had high densities of response. There was only one range where M was used (R = 2449 to R = 2451). Looking at Table 10, the fit would have been improved if the decision tree had not differentiated within this range and had called all of these cases Yes. (There would have been 4 fewer errors out of 20,000, yielding essentially the same fit with the same correct response of 0.984.) The downside of decision trees is that they often overfit the data (as they did in Table 10) and can yield an excessive number of rules for users to apply. Table 15 presents a comparison of methods based on inferences from our two studies. While our study uses prediction accuracy along with cumulative gains for model comparison, in practice the type of error can be considered in terms of relative costs, which ties the evaluation to profit. For example, our study shows that increasing the cutoff level between predicting response or not can improve correct classification. However, a more precise means to assess this would be to apply the traditional cost function reflecting the cost of the two types of error. The same consideration applies in evaluating other predictive models. Thus, specific models should be used in light of these relative costs. The good performance of those data mining methods (particularly decision tree), in terms of prediction accuracy and cumulative gains, indicates that three variables (R, F, and M) alone can be useful for building a reliable customer response model. This echoes the importance of RFM variables in understanding customer purchase behavior and developing response models for marketing decisions.
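The cost-based assessment described above can be sketched as follows. The cost function weights the two error types separately: a false positive (promoting to a customer who does not respond) and a false negative (skipping a customer who would have responded). The function names and the cost values in the usage note are hypothetical; the paper does not report specific costs.

```python
def misclassification_cost(actual, predicted, cost_fp, cost_fn):
    """Total cost = (false positives x cost_fp) + (false negatives x cost_fn)."""
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return fp * cost_fp + fn * cost_fn


def best_cutoff(scores, actual, cutoffs, cost_fp, cost_fn):
    """Return the (cutoff, cost) pair with the lowest total cost."""
    results = []
    for cutoff in cutoffs:
        predicted = [1 if s >= cutoff else 0 for s in scores]
        cost = misclassification_cost(actual, predicted, cost_fp, cost_fn)
        results.append((cutoff, cost))
    return min(results, key=lambda pair: pair[1])
```

For instance, if a wasted mailing is assumed to cost 1 unit and a missed responder 20 units, a low cutoff that contacts many customers tends to be cheapest; reversing the two costs favors a high cutoff. The cutoff that maximizes accuracy is therefore not necessarily the one that minimizes cost.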
Previous research [e.g., 1] also shows that inclusion of non-RFM attributes (e.g., income) is likely to slightly improve model performance. However, a sophisticated model with too many variables is not very effective for marketing practitioners, and reducing the number of variables is important for the practical use of predictive models. Marketers should be aware of this tradeoff between a simple model (with fewer variables) and a sophisticated model (with a large number of variables) and develop a well-balanced model using their market and product knowledge. To repeat the contributions of this paper given in the Introduction, we have demonstrated how RFM models and variants can be implemented. RFM models have the relative advantage that they are simple in concept, and thus understandable to users. However, they can easily be improved in terms of predictive accuracy (or profitability, given situational data) by using classical data mining models. Of these traditional data mining models, decision trees are especially attractive in that they have easily understood output. These advanced predictive models are highly beneficial in the practice of direct marketing, since they can use only three behavioral input variables and still generate results significantly better than the traditional RFM model and its variants.