English ISI article No. 22034
Article title

Turning telecommunications call details to churn prediction: a data mining approach
Article code: 22034
Publication year: 2002
Length of English article: 10 pages (PDF)
Source

Publisher: Elsevier - ScienceDirect

Journal: Expert Systems with Applications, Volume 23, Issue 2, August 2002, Pages 103–112

Keywords
Data mining, Telecommunications data mining, Churn prediction, Churn management, Classification analysis, Decision tree induction
Article preview

Abstract

As deregulation, new technologies, and new competitors open up the mobile telecommunications industry, churn prediction and management have become a great concern to mobile service providers. A mobile service provider wishing to retain its subscribers needs to be able to predict which of them are at risk of switching providers, so that it can make those subscribers the focus of its customer retention efforts. In response to the limitations of existing churn-prediction systems and the unavailability of customer demographics at the mobile telecommunications provider investigated, we propose, design, and experimentally evaluate a churn-prediction technique that predicts churning from subscriber contractual information and call pattern changes extracted from call details. The proposed technique is capable of identifying potential churners at the contract level for a specific prediction time-period. In addition, the proposed technique incorporates the multi-classifier class-combiner approach to address the challenge of a highly skewed class distribution between churners and non-churners. The empirical evaluation results suggest that the proposed call-behavior-based churn-prediction technique exhibits satisfactory predictive effectiveness when more recent call details are employed for the churn prediction model construction. Furthermore, the proposed technique is able to demonstrate satisfactory or reasonable predictive power within the one-month interval between model construction and churn prediction. Using a previous demographics-based churn-prediction system as a reference, the lift factors attained by our proposed technique appear largely satisfactory.
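
The lift factor mentioned above is commonly defined as the churn rate among the subscribers a model targets divided by the churn rate of the whole population. The following is a minimal sketch under that common definition, not necessarily the exact measure used in the paper; the function name `lift_factor`, its parameters, and the toy data are hypothetical.

```python
from typing import Sequence


def lift_factor(scores: Sequence[float], churned: Sequence[int],
                top_fraction: float) -> float:
    """Lift among the highest-scoring fraction of contracts.

    lift = (churn rate within the targeted group) / (overall churn rate)
    `churned` holds 1 for a churner and 0 for a non-churner.
    """
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    if len(scores) != len(churned) or len(scores) == 0:
        raise ValueError("scores and churned must be non-empty and equal in length")
    overall_rate = sum(churned) / len(churned)
    if overall_rate == 0:
        raise ValueError("lift is undefined when nobody churned")
    # Rank contracts from most to least likely to churn.
    ranked = sorted(zip(scores, churned), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(top_fraction * len(ranked)))
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    return top_rate / overall_rate


# Toy data (assumed): 10 contracts, 2 churners, both ranked at the top.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
churned = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(lift_factor(scores, churned, top_fraction=0.2))  # 5.0
```

A lift of 1.0 means the model does no better than contacting subscribers at random; in the toy data, targeting the top 20% reaches churners five times as often as random selection would.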

Introduction

As deregulation, new technologies, and new competitors have opened up the telecommunications industry, the telecommunications service market has become more competitive than ever (Gerpott et al., 2001; Kappert and Omta, 1997). To survive or maintain an advantage in an increasingly competitive marketplace, many companies are turning to data mining techniques to address such challenging issues as fraud detection (Burge and Shawe-Taylor, 1997; Cox et al., 1997; Ezawa and Norton, 1996; Taniguchi et al., 1998), prospect profiling (Kappert and Omta, 1997; Wei et al., 2000), and churn prediction and management (Berson, Smith, & Thearling, 2000). Churn prediction and management is a concern for many industries, but it is particularly acute in the strongly competitive and now broadly liberalized mobile telecommunications industry. Subscriber churning (often referred to as customer attrition in other industries) in mobile telecommunications refers to the movement of subscribers from one provider to another. Many subscribers frequently churn from one provider to another in search of better rates or services, or for the benefits of signing up with a new carrier (e.g. receiving the latest cellular phone). It is estimated that the average churn rate for mobile telecommunications is 2.2% per month (Berson et al., 2000). That is, about 27% of a given carrier's subscribers are lost each year, making it essential to develop an effective churn-reduction method. The cost of acquiring a new mobile service subscriber is estimated to be from $300 to $600 in sales support, marketing, advertising, and commissions (Berson et al., 2000; SPSS, 1999), whereas the cost of retaining an existing subscriber is generally much lower. Moreover, existing subscribers tend to generate more cash flow and profit, since they are less sensitive to price and often lead to sales referrals (Eiben, Euverman, Kowalczyk, & Slisser, 1998).
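
The step from the 2.2% monthly churn rate to an annual figure can be checked with a short calculation. The sketch below shows two common ways to annualize a monthly rate; which one underlies the quoted annual estimate is not stated in the text, so both conventions and the function names are assumptions for illustration.

```python
def annual_churn_simple(monthly_rate: float) -> float:
    """Annual churn when the monthly rate is simply multiplied by 12."""
    return 12 * monthly_rate


def annual_churn_compounded(monthly_rate: float) -> float:
    """Annual churn when each month's loss applies to the remaining base."""
    return 1 - (1 - monthly_rate) ** 12


monthly = 0.022  # 2.2% per month, as quoted from Berson et al. (2000)
print(f"simple:     {annual_churn_simple(monthly):.1%}")      # 26.4%
print(f"compounded: {annual_churn_compounded(monthly):.1%}")  # 23.4%
```

Simple multiplication gives 26.4%, which is the closer of the two to the roughly 27% quoted above; compounding over a fixed starting base gives a lower figure because each month's losses are drawn from a shrinking pool.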
Due to the high cost of acquiring new subscribers and the considerable benefits of retaining existing ones, building a churn prediction model to facilitate subsequent churn management and customer retention is critical for the success or bottom-line survival of a mobile telecommunications carrier in this greatly compressed market-space.

Data mining refers to a process of extracting previously unknown, valid, and actionable patterns or knowledge from large databases for crucial business decision support (Berry and Linoff, 1997; Cabena et al., 1998; Chen et al., 1996; Frawley et al., 1991). Based on the kinds of knowledge that can be discovered in databases, data mining techniques can be broadly classified into several categories, including classification, clustering, dependency analysis, data visualization, and text mining (Shaw, Subramaniam, Tan, & Welge, 2001). Classification analysis is a process that induces a model to categorize a set of pre-classified instances (called training examples) into classes. Such a classification model is then used to classify future instances. Widely adopted classification techniques include decision tree induction, decision rule induction, and neural networks. Clustering analysis is a process whereby a set of instances (without a predefined class attribute) is partitioned (or grouped) according to some distance metric into several clusters, such that all instances in one cluster are similar to each other and different from the instances of other clusters. Dependency analysis discovers dependency patterns (e.g. association rules, sequential patterns, temporal patterns, and episode rules) embedded in data. Data visualization allows decision makers to view complex patterns in the data as visual objects in three dimensions and color; it supports advanced manipulation capabilities to slice, rotate, or zoom the objects to provide varying levels of detail of the patterns observed.
Finally, text mining (including text categorization, document clustering, term association discovery, information extraction, etc.) extracts patterns from textual documents and can be applied to facilitate document management and retrieval or to discover knowledge hidden in texts.

Past research on churn prediction in the telecommunications industry mainly employed classification analysis techniques for the construction of churn prediction models, using as predictors (i.e. input variables) user demographics, contractual data, customer service logs, and/or call patterns aggregated from call details (e.g. average call duration, number of outgoing calls, etc.). For example, the classification and regression trees (CART) algorithm (Breiman, Friedman, Olshen, & Stone, 1984) was employed for churn prediction based on customer demographics and contractual data (e.g. length of service, contract type, etc.) as well as customer service logs from the customer service center that captured inbound calls from the customers (Berson et al., 2000).

However, existing churn-prediction systems have several disadvantages. First, the use of customer demographics in churn prediction places the resulting churn analysis at the customer level rather than the contract (or subscriber) level. In other words, propensities of churning are calculated on a per-customer rather than a per-contract basis. It is quite common for a customer to hold several mobile service contracts concurrently with a particular carrier, with some contracts more likely to be churned than others. In this regard, customer-level churn prediction is considered inappropriate. Second, information on some of the input variables employed by existing churn-prediction systems is frequently not readily available. For example, the mobile telecommunications company investigated in this study has very limited customer demographic information (e.g. only the name, date of birth, identification number, and billing address were collected for each subscriber). Unavailability of customer profiles, prevailing in most telecommunications companies in many countries, limits the applicability of existing churn-prediction systems.

In response to the described limitations of existing churn-prediction systems, we exploit the use of call pattern changes and contractual data for developing a churn-prediction technique that identifies potential churners at the contract level. Conceivably, subscriber churn is not an instantaneous occurrence that leaves no trace. Before an existing subscriber churns, his/her call patterns might change (e.g. the number of outgoing calls gradually declines). In other words, changes in call patterns are likely to include warning signals pointing toward churning. Such call pattern changes can be extracted from subscribers' call details and are valuable for constructing a churn prediction model based on a classification analysis technique.

The remainder of the paper is organized as follows. Section 2 details the data and variables used for the target churn prediction problem. Section 3 depicts the proposed churn-prediction technique, using a decision tree induction algorithm for learning. Section 4 describes the evaluation design and discusses important experimental results. This paper is concluded in Section 5 with a summary, discussion of its contributions and limitations, and some future research directions.
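
The idea of extracting call pattern changes from call details, described in the introduction above, can be illustrated with a small sketch. It counts outgoing calls per contract in consecutive sub-periods and then takes the relative change between neighboring sub-periods. The record layout (`CallRecord`), the sub-period length, and the convention for a zero baseline are all assumptions for illustration; this excerpt does not give the paper's actual variable definitions.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List, Sequence


@dataclass(frozen=True)
class CallRecord:
    """One outgoing call; the field names are hypothetical."""
    contract_id: str
    day: date
    duration_min: float


def outgoing_calls_per_period(calls: Iterable[CallRecord], start: date,
                              period_days: int, n_periods: int) -> Dict[str, List[int]]:
    """Count each contract's calls in consecutive sub-periods from `start`."""
    counts: Dict[str, List[int]] = defaultdict(lambda: [0] * n_periods)
    for call in calls:
        idx = (call.day - start).days // period_days
        if 0 <= idx < n_periods:  # ignore calls outside the observation window
            counts[call.contract_id][idx] += 1
    return dict(counts)


def relative_changes(series: Sequence[float]) -> List[float]:
    """Relative change between neighboring sub-periods: (curr - prev) / prev."""
    changes = []
    for prev, curr in zip(series, series[1:]):
        if prev == 0:
            # Assumed convention for a zero baseline: 0.0 if still zero, else 1.0.
            changes.append(0.0 if curr == 0 else 1.0)
        else:
            changes.append((curr - prev) / prev)
    return changes


# Toy call details (assumed): contract "A" makes fewer calls in each 10-day period.
start = date(2002, 1, 1)
days = [1, 2, 3, 4, 12, 15, 25]
calls = [CallRecord("A", date(2002, 1, d), 2.0) for d in days]
counts = outgoing_calls_per_period(calls, start, period_days=10, n_periods=3)
print(counts["A"])                    # [4, 2, 1]
print(relative_changes(counts["A"]))  # [-0.5, -0.5]
```

A run of negative changes like this one is the kind of warning signal the text has in mind; in a full pipeline, such change values would sit alongside contractual data as inputs to the classifier.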

Conclusion

Churn prediction and management is critical in the fast-changing, strongly competitive, and now broadly liberalized mobile communications market. To improve customer retention, a mobile telecommunications service provider has to be able to predict the at-risk subscribers on whom the subsequent customer retention effort should be focused. In response to the limitations of existing churn-prediction systems and the unavailability of customer demographics at the mobile telecommunications provider studied, we proposed, designed, and experimentally evaluated a churn-prediction technique using as predictors the contractual information of subscribers and their call pattern changes extracted from the call details. The proposed technique is capable of identifying potential churners at the contract level for a specific prediction period. In addition, the proposed technique adopted the multi-classifier class-combiner approach to address the challenge of a highly skewed class distribution between churners and non-churners. The empirical evaluation results suggested that the multi-classifier class-combiner approach outperformed the single-classifier approach. The proposed call-behavior-based churn-prediction technique exhibited satisfactory predictive effectiveness when more recent call details were employed for the churn prediction model construction. Furthermore, the proposed technique was able to demonstrate satisfactory or reasonable predictive power within a one-month interval between model construction and churn prediction. Using a prior demographics-based churn-prediction system as a reference, the lift factors attained by our proposed technique were considered largely satisfactory. This study benefits not only churn prediction research and practice but also other data mining applications with identical or similar characteristics. Continuing research should be aligned toward furthering the effectiveness and generalizability of the proposed technique.
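
One way to read the multi-classifier class-combiner approach mentioned above is sketched below. The non-churners are split into disjoint subsets, each subset is paired with all churners to train one classifier on a less skewed sample, and the classifiers' predictions are combined by voting. This is a sketch under assumptions, not the paper's implementation: a one-level decision stump stands in for a full decision tree inducer, simple majority voting is assumed as the combiner, and all names are hypothetical.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Example = Tuple[Sequence[float], int]   # (feature vector, label); 1 = churner
Classifier = Callable[[Sequence[float]], int]


def train_stump(data: List[Example]) -> Classifier:
    """Learn a one-level decision tree by exhaustive search (stand-in learner)."""
    best = None  # (errors, feature index, threshold, label when x <= threshold)
    for f in range(len(data[0][0])):
        for threshold in sorted({x[f] for x, _ in data}):
            for left_label in (0, 1):
                errors = sum(
                    (left_label if x[f] <= threshold else 1 - left_label) != y
                    for x, y in data
                )
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, left_label)
    _, f, threshold, left_label = best
    return lambda x: left_label if x[f] <= threshold else 1 - left_label


def train_class_combiner(data: List[Example], n_classifiers: int,
                         seed: int = 0) -> Classifier:
    """Train one classifier per majority-class slice and combine by voting."""
    if n_classifiers < 1:
        raise ValueError("n_classifiers must be at least 1")
    churners = [ex for ex in data if ex[1] == 1]
    non_churners = [ex for ex in data if ex[1] == 0]
    random.Random(seed).shuffle(non_churners)
    # Disjoint slices of the non-churners, each paired with every churner.
    members = [
        train_stump(churners + non_churners[i::n_classifiers])
        for i in range(n_classifiers)
    ]

    def predict(x: Sequence[float]) -> int:
        votes = Counter(member(x) for member in members)
        return 1 if votes[1] > votes[0] else 0  # ties go to non-churner

    return predict


# Toy data (assumed): one feature, the relative change in outgoing calls.
data = [([-0.8], 1), ([-0.7], 1), ([-0.6], 1)]
data += [([0.1 * (i % 5) - 0.2], 0) for i in range(27)]
model = train_class_combiner(data, n_classifiers=3)
print(model([-0.75]), model([0.05]))  # 1 0
```

With 3 churners and 27 non-churners, each member classifier sees 3 churners against 9 non-churners instead of 27, which is the rebalancing effect the approach relies on.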
Some ongoing and future research directions are briefly summarized as follows. First, this study employed only contractual data and call details for the target churn prediction. The inclusion of additional input variables (e.g. those extracted from customer complaints and service logs) in our proposed technique might further enhance its predictive effectiveness. Second, subscribers in different geographical locations may exhibit dissimilar call behaviors. Thus, performing empirical evaluations of the developed technique in different geographical locations represents an interesting direction for further research. Third, as mentioned, constant re-learning or re-discovery of a churn prediction model is required due to the evolving nature of subscribers' call behavior, which is also likely to be susceptible to events in the mobile telecommunications service market. The provision of a subscriber-centric data warehouse would be desirable to support the described knowledge maintenance requirement. In addition, churning is not restricted to the telecommunications market and is also a great concern for those industries (e.g. credit card issuers and internet service providers) where stiff competition provides incentives for customers to switch. Thus, expanding the developed technique to other industries suggests interesting directions for future research.