Constrained optimization of data-mining problems to improve model performance: A direct-marketing application
|Article code||Publication year||English article||Persian translation||Word count|
|22064||2005||11-page PDF||Order||6268 words|
Publisher: Elsevier - ScienceDirect
Journal : Expert Systems with Applications, Volume 29, Issue 3, October 2005, Pages 630–640
Although most data-mining (DM) models are complex and general in nature, the implementation of such models in specific environments is often subject to practical constraints (e.g. budget constraints) or thresholds (e.g. only mail to customers with an expected profit higher than the investment cost). Typically, the DM model is calibrated neglecting those constraints/thresholds. If the implementation constraints/thresholds are known in advance, this indirect approach delivers a sub-optimal model performance. Adopting a direct approach, i.e. estimating a DM model in knowledge of the constraints/thresholds, improves model performance as the model is optimized for the given implementation environment. We illustrate the relevance of this constrained optimization of DM models on a direct-marketing case, i.e. in the field of customer relationship management. We optimize an individual-level response model for specific mailing depths (i.e. the percentage of customers of the house list that actually receives a mail given the mailing budget constraint) and compare its predictive performance with that of a traditional response model, neglecting the mailing depth during estimation. The results are in favor of the constrained-optimization approach.
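The notion of mailing depth used in the abstract can be made concrete with a small sketch. It assumes each customer already has a response score from some model; the function names, the scores, and the response flags below are hypothetical toy values, not data from the paper.

```python
def select_by_depth(scores, depth):
    """Return the indices of the top `depth` fraction of customers by score."""
    if not 0 < depth <= 1:
        raise ValueError("depth must be in (0, 1]")
    # Number of customers the mailing budget allows (at least one).
    k = max(1, int(round(depth * len(scores))))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def response_rate_at_depth(scores, responded, depth):
    """Share of mailed customers who actually responded at a given depth."""
    mailed = select_by_depth(scores, depth)
    return sum(responded[i] for i in mailed) / len(mailed)


# Hypothetical house list of ten customers (toy values for illustration only).
scores = [0.91, 0.15, 0.62, 0.40, 0.77, 0.05, 0.33, 0.58, 0.21, 0.84]
responded = [1, 0, 1, 0, 0, 0, 0, 1, 0, 1]

rate_20 = response_rate_at_depth(scores, responded, 0.20)  # mail the top 20%
rate_50 = response_rate_at_depth(scores, responded, 0.50)  # mail the top 50%
```

Evaluating a model only on the customers inside the chosen depth is one simple way to compare an unconstrained model with one tuned for that depth; the paper's own performance measure may differ.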
Firms typically make marketing decisions in a constrained business and operating environment. For instance, a financial-services provider planning a cross-sell action knows in advance that it is not optimal to propose more than 1, 2, …, or k services to the customers being targeted. Similarly, an online retailer might want to recommend a limited set of k products on the visitor's customized welcome page. Today, most companies in the pharmaceutical industry face the problem of drugs whose patents will expire in the near future. In an attempt to compete with the generic substitutes for these medicines, a pharmaceutical company might opt to send its sales team to k physicians currently prescribing its patented drug. Given this high cost/impact marketing strategy, it may not be profitable to visit all doctors currently prescribing the drug. For instance, the sales team could visit only doctors whose net expected lifetime value (LTV) is higher than a certain threshold. A hypermarket wants to create store traffic by mailing a reduction coupon for a certain product. Given the stock limits, the hypermarket will have to select the k customers to whom it mails the coupon in order to be able to meet demand (Buckinx, Moons, Van den Poel, & Wets, 2004). Similarly, the management of a supermarket loyalty program may want to limit defection among its top k% of customers (Buckinx & Van den Poel, 2005). Finally, a car manufacturer wants to promote a new sports car by inviting prospects to join a professional racer for laps on a circuit on a specified day. Given the time constraint (i.e. only one day) and the availability of a single free seat, the manufacturer can invite only a limited number of k prospects. These examples clearly indicate that many marketing problems are subject to practical constraints that result in implementation limitations. In most cases, these constraints are known beforehand.
Nevertheless, in practice, the data-mining models used to solve these problems are commonly estimated without regard to the constraints/thresholds. Only at implementation is the solution restricted to fit within the practical boundaries. In this paper, we advocate a direct approach: estimating a data-mining (DM) model in awareness of the constraints/thresholds. The constrained-optimization perspective on data mining is (usually) superior to the indirect approach (i.e. estimate a general DM model and restrict the solution afterwards to obey the implementation limitations) as it results in higher model performance. This gain comes from directly tailoring (i.e. optimizing) the DM model to fit the implementation environment at hand. We illustrate this constrained optimization of DM models on a target-selection problem faced by a direct mailer subject to a budget constraint.
The remainder of this paper is structured as follows. In Section 2, we outline how constrained optimization applies to direct-marketing problems. We note that the illustration of constrained optimization of DM models focuses on target-selection DM problems given budget constraints. In Section 3, we introduce a constrained optimization of a binary response DM model. Employing weighted maximum likelihood to estimate a binary logistic regression model, we directly optimize the response model for a specific mailing depth stemming from a budget constraint. Section 4 describes the application details and specifies how the new constrained optimization of the response DM model is applied to our real-life direct-marketing case. We compare the model accuracy of the indirect response model to that of our new response model optimized for a given mailing depth (i.e. implementation constraint). We conclude in Section 5 by discussing the main findings and suggesting some avenues for further research.
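As a rough illustration of what weighted maximum likelihood means for a binary logistic regression, the sketch below maximises the weighted log-likelihood sum_i w_i [y_i log p_i + (1 - y_i) log(1 - p_i)] by plain gradient ascent. This excerpt does not state how the paper derives the observation weights from the mailing depth, so the weights are simply an input here; the function names, the toy data, and the example weight vector are all assumptions made for illustration, not the authors' procedure.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_weighted_logit(X, y, weights, lr=0.5, n_iter=2000):
    """Fit a logistic regression by weighted maximum likelihood.

    X is a list of feature lists (without an intercept column).
    Returns [intercept, coef_1, ..., coef_p].
    """
    beta = [0.0] * (len(X[0]) + 1)
    total_w = sum(weights)
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for xi, yi, wi in zip(X, y, weights):
            row = [1.0] + list(xi)  # prepend the intercept term
            p = sigmoid(sum(b * v for b, v in zip(beta, row)))
            for j, v in enumerate(row):
                # Gradient of the weighted log-likelihood: w_i * (y_i - p_i) * x_ij
                grad[j] += wi * (yi - p) * v
        # Normalise by the total weight so the step size does not depend on scale.
        beta = [b + lr * g / total_w for b, g in zip(beta, grad)]
    return beta


# Hypothetical toy data: one predictor, binary response.
X = [[0.2], [0.5], [0.9], [1.4], [1.8], [2.3], [2.7], [3.1]]
y = [0, 0, 1, 0, 1, 0, 1, 1]

# Uniform weights reproduce ordinary maximum likelihood.
beta_plain = fit_weighted_logit(X, y, [1.0] * len(y))

# Arbitrary example weights (an assumption, not the paper's scheme) that put
# more emphasis on some observations than on others.
example_weights = [1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0]
beta_weighted = fit_weighted_logit(X, y, example_weights)
```

Two properties make the weighting easy to reason about: multiplying all weights by a constant leaves the estimate unchanged, and giving an observation a weight of zero is equivalent to dropping it from the sample.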
Conclusion (English)
5. Conclusion
In this paper, we expressed our doubts about the common practice of building DM models that neglect constraints/thresholds and only subsequently restricting the solution to fit within the practical boundaries. We advocated a direct approach, i.e. estimating a DM model in knowledge of the constraints/thresholds, and hypothesized an improved model performance as the DM model is tailored to fit the implementation environment at hand. This hypothesis was tested on a target-selection DM problem given a budget constraint. Employing weighted maximum likelihood to estimate a binary logistic regression, we directly optimized the response model for a specific mailing depth stemming from a budget constraint. The results largely confirm our hypothesis. For mailing depths up to 48%, the constraint-optimized binary logistic regression is superior to a traditional binary logistic regression. The decreasing performance difference with increasing mailing depths is in line with the results of Bhattacharyya (1999) and could be expected, as greater mailing depths imply larger proportions of customers, leaving less freedom for the optimized approach to manipulate. As our direct approach shows high performance at small mailing depths, a great opportunity lies in applying it to the promotion of exclusive products/services (e.g. promotion of a new luxury, limited-edition motorbike), to selecting customers to offer a reward limited in number (e.g. tickets for a concert), or to selecting persons with a specific profile (e.g. students for a pilot program or patients for testing a very expensive new drug). Compared to Bhattacharyya's genetic algorithm-based model (1999), our direct approach is less susceptible to overfitting and is rather simple, which facilitates real-life adoption. Whereas Bhattacharyya (1999) focuses on the prediction of a continuous dependent variable, we have studied a binary outcome.
Both mailing-depth-optimized models are limited to objective functions assuming a linear relationship between the predictors and the dependent variable. Therefore, a promising avenue for further research is the extension of the direct approach to non-linear representations. Genetic programming or neural networks might be employed for this purpose, albeit in combination with an ensemble method to reduce overfitting. Our WML-based approach as well as Bhattacharyya's genetic algorithm-based approach (1999) are limited to optimization in knowledge of only one constraint. Future research should also consider the optimization of an outcome tailored to an implementation environment characterized by several constraints. Finally, this study clearly advocates future DM models built in awareness of constraints and hence promotes constrained optimization.