Article title

Extracting underlying meaningful features and canceling noise using independent component analysis for direct marketing

Article code	Publication year	Pages (English article)
23575	2007	11 pages (PDF)

Source

Publisher : Elsevier - Science Direct

Journal : Expert Systems with Applications, Volume 33, Issue 1, July 2007, Pages 181–191

Keywords

Independent component analysis, Principal component analysis, Artificial neural networks, Direct marketing, Customer relationship management

Article preview

Abstract

As the Internet spreads widely, it has become easier for companies to obtain and utilize valuable information on their customers. Nevertheless, many of them have difficulty using the information effectively because of the huge amount of customer data that must be analyzed. In addition, the data usually contain much noise due to the anonymity of the Internet. Consequently, extracting the underlying meanings and canceling the noise of the collected customer data are crucial for companies to implement their strategies for customer relationship management. As a novel solution, we propose the use of independent component analysis (ICA). ICA is a multivariate statistical tool that extracts independent components, or sources of information, given only observed data that are assumed to be linear mixtures of some unknown sources. Moreover, ICA is able to reduce the dimension of the observed data, in particular by removing noisy variables. To validate the usefulness of ICA, we applied it to a real-world one-to-one marketing case. In this study, we used ICA as a preprocessing tool, and made a prediction for potential buyers using artificial neural networks (ANNs). We also applied PCA as a comparative model for ICA. The experimental results showed that the ICA-preprocessed ANN outperformed all the comparative classifiers without preprocessing as well as the PCA-preprocessed ANN.
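
The linear-mixture assumption in the abstract can be written compactly. The notation below is an assumption added here for illustration, following the standard ICA formulation: x is the vector of observed variables, s the vector of unknown independent sources, A the unknown mixing matrix, and W the demixing matrix that ICA estimates.

```latex
\mathbf{x} = A\,\mathbf{s}, \qquad \hat{\mathbf{s}} = W\,\mathbf{x}, \qquad W \approx A^{-1}
```

In this formulation the sources can be recovered only up to their order and scale, so the estimated components need not appear in the same sequence or units as the original sources.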

Introduction

The Internet has dramatically changed the management environment in many ways. In particular, it has provided various ways for companies to build communication channels and relationships with their customers. Recently, new technologies such as the WWW (World Wide Web), DW (Data Warehouse), and mobile communications have made it easier to collect a huge amount of data on customers and to provide personalized cues that attract them to make a purchase. Thus, in order to win in severe competition, companies should deeply understand their customers and provide more sophisticated products or services by using up-to-date information technologies. However, it is never easy to implement a system that facilitates a deep understanding of customers in a real-world business, even when companies are already equipped with state-of-the-art information technologies. There are two main reasons. The first is that there is too much information on customers (i.e., information overload), so it is very hard to understand all of it in depth. Moreover, most of the collected information is a record of simple observations, whereas what is useful to companies is not the observations themselves but their underlying meaning. In particular, it is generally almost impossible to interpret the observations to find the underlying meanings without the help of human experts. The second reason is that data collected in an online environment usually contain much noise. Given the principle of GIGO (garbage in, garbage out), a noisy data set may lead the companies’ customer strategies in the wrong direction. Consequently, extracting the underlying meaning from the observed data without prior knowledge (sometimes called ‘blind source separation’ in engineering areas) and canceling noise from the collected data have been very important issues in data-driven marketing approaches.
This study proposes a novel technique for overcoming these obstacles: independent component analysis (ICA). ICA is a recently developed method that finds a linear representation of non-Gaussian data whose components are statistically independent (Hyvärinen & Oja, 2000). It is quite similar to principal component analysis (PCA), and PCA and its variations have so far been applied for similar purposes (Casarotto et al., 2004; Jutten & Herault, 1991; Karhunen et al., 1998; Kim & Yum, 2005; Stetter et al., 2000; Zhu et al., 2005). However, ICA uses a higher-order statistical method while PCA uses a second-order method. Thus, ICA may extract more useful information from the data than PCA (Back & Weigend, 1997; Cao et al., 2003; Du et al., 2004; Fragos et al., 2003; Katsumata & Matsuyama, 2005; Kermit & Tomic, 2003). As a result, ICA has been applied in a number of areas, including motion and image identification (Du et al., 2004; Katsumata & Matsuyama, 2005), demand estimation (Liao & Niebur, 2003), financial data analysis (Back & Weigend, 1997; Cao et al., 2003; Kiviluoto & Oja, 1998; Wu & Yu, 2005) and behavioral prediction (Fragos et al., 2003). However, there have been few studies that use ICA as a preprocessing tool for customer data in customer relationship management (CRM). In this study, we propose ICA as a method to extract underlying meaningful information from collected customer data without prior knowledge and also to remove useless noisy variables. To validate the usefulness of ICA, we applied it to a real-world one-to-one marketing case. In addition, we applied PCA to the same data as a comparative method to confirm the superiority of ICA over PCA. After the two techniques transformed the original data sets into preprocessed ones, we used the preprocessed data as inputs to the prediction models, artificial neural networks (ANNs). Finally, we compared the prediction accuracy of each method.
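
The contrast between second-order and higher-order statistics can be illustrated with a small two-variable sketch. This is not the authors' implementation: the synthetic uniform sources, the mixing coefficients, and all function names are assumptions made for illustration, and a simple grid search over a rotation angle stands in for a production algorithm such as FastICA. Whitening uses only the covariance (second order); the extra ICA step rotates the whitened data to maximize the absolute excess kurtosis (fourth order).

```python
import math
import random


def mean(v):
    return sum(v) / len(v)


def center(v):
    m = mean(v)
    return [a - m for a in v]


def excess_kurtosis(v):
    """Fourth-order statistic: near 0 for Gaussian data, nonzero otherwise."""
    v = center(v)
    var = mean([a * a for a in v])
    return mean([a ** 4 for a in v]) / var ** 2 - 3.0


def rotate(x, y, theta):
    """Apply the 2x2 orthogonal matrix [[cos, sin], [-sin, cos]] to each (x, y) pair."""
    c, s = math.cos(theta), math.sin(theta)
    return ([c * a + s * b for a, b in zip(x, y)],
            [-s * a + c * b for a, b in zip(x, y)])


def pca_whiten(x, y):
    """Second-order step: decorrelate along principal axes, scale to unit variance."""
    x, y = center(x), center(y)
    sxx = mean([a * a for a in x])
    syy = mean([b * b for b in y])
    sxy = mean([a * b for a, b in zip(x, y)])
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # principal-axis angle
    p1, p2 = rotate(x, y, theta)
    s1 = math.sqrt(mean([a * a for a in p1]))
    s2 = math.sqrt(mean([b * b for b in p2]))
    return [a / s1 for a in p1], [b / s2 for b in p2]


def ica_2d(x, y, steps=180):
    """Higher-order step: rotate whitened data to maximize total |excess kurtosis|."""
    w1, w2 = pca_whiten(x, y)
    angles = (i * (math.pi / 2) / steps for i in range(steps))
    best = max(angles, key=lambda t: sum(abs(excess_kurtosis(c))
                                         for c in rotate(w1, w2, t)))
    return rotate(w1, w2, best)


def corr(a, b):
    a, b = center(a), center(b)
    num = sum(p * q for p, q in zip(a, b))
    return num / math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))


def recovery_score(sources, components):
    """Worst case, over sources, of the best |correlation| with any component."""
    return min(max(abs(corr(s, c)) for c in components) for s in sources)


# Hypothetical data: two independent non-Gaussian sources, linearly mixed.
random.seed(0)
n = 2000
s1 = [random.uniform(-1, 1) for _ in range(n)]
s2 = [random.uniform(-1, 1) for _ in range(n)]
x = [a + 0.6 * b for a, b in zip(s1, s2)]
y = [0.4 * a + b for a, b in zip(s1, s2)]

pca_score = recovery_score((s1, s2), pca_whiten(x, y))
ica_score = recovery_score((s1, s2), ica_2d(x, y))
print(f"PCA recovery: {pca_score:.2f}  ICA recovery: {ica_score:.2f}")
```

In this toy setup the whitened components are expected to remain mixtures of both sources, whereas the kurtosis-maximizing rotation should align each component with a single source. A score close to 1 means every source is matched by some component up to sign and scale.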
The article is organized as follows. In Section 2, we provide a brief explanation of the theoretical background of ICA and PCA. In the next section, ICA, the proposed method, is explained in depth. Section 4 presents the research design and experiments. In the fifth section, the experimental results are presented and discussed. The final section discusses the contributions and limitations of this study.

Conclusion

In this study, we proposed ICA as a preprocessing tool for data sets collected from customers for CRM. ICA serves two functions. One is the extraction of underlying meaningful information from the observed data set, and the other is the cancellation of noisy variables. These functions may be especially important for CRM in the Internet environment (generally called e-CRM). The Internet has provided companies with infinite opportunities to collect a huge amount of data on their customers. However, it is very difficult to extract meaningful underlying information from such a huge data set, and the task becomes even more difficult without prior knowledge or expertise. For example, BMI (Body Mass Index), an index that can be calculated simply from height and weight, is one of the important variables for characterizing diet customers, but it is never easy for novices or automated computer systems to construct this feature without any prior knowledge about dieting. ICA, which produces meaningful underlying features from the collected and observed data set without any expertise, can therefore be very useful in the e-CRM domain. Also, the Internet is a medium that ensures the anonymity of its users, so data collected from the Internet usually contain much noise. Thus, the noise-reducing function of ICA is also important in the e-CRM domain. As shown in our experimental results, applying preprocessing techniques such as PCA and ICA seems to be very effective in building a prediction model in CRM. Furthermore, among the preprocessing techniques, ICA, which uses a higher-order statistical method, may outperform PCA, which uses a second-order method, for non-Gaussian data sets. Although there are many studies that apply ICA as a preprocessing tool in computer science or engineering, there have been few approaches that use ICA in management areas except for stock market prediction.
We therefore expect our study to be recognized as a pioneering effort to stimulate the use of ICA as a preprocessing tool for management issues, especially CRM. In future studies, the usefulness and generalizability of ICA should be validated in other management applications. In addition, effort should be made to interpret the meaning of the features extracted by ICA.
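
For reference, the BMI feature mentioned in the conclusion is the kind of hand-built variable that requires domain knowledge. A minimal sketch follows, assuming metric units; the function name and the sample values are hypothetical and do not come from the study's data.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI is weight in kilograms divided by the square of height in meters."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical customer record: 70 kg, 1.75 m.
bmi = body_mass_index(70.0, 1.75)
print(round(bmi, 2))
```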