Download English ISI article No. 2406
Article title

The impact of preprocessing on data mining: An evaluation of classifier sensitivity in direct marketing
Article code: 2406
Year of publication: 2006
Length of English article: 20 pages (PDF)
Source

Publisher: Elsevier - Science Direct

Journal: European Journal of Operational Research, Volume 173, Issue 3, 16 September 2006, Pages 781–800

Keywords
- Data mining - Neural networks - Data preprocessing - Classification - Marketing

Abstract

Corporate data mining faces the challenge of systematic knowledge discovery in large data streams to support managerial decision making. While research in operations research, direct marketing and machine learning focuses on the analysis and design of data mining algorithms, the interaction of data mining with the preceding phase of data preprocessing has not been investigated in detail. This paper investigates the influence of different preprocessing techniques of attribute scaling, sampling, coding of categorical as well as coding of continuous attributes on the classifier performance of decision trees, neural networks and support vector machines. The impact of different preprocessing choices is assessed on a real world dataset from direct marketing using a multifactorial analysis of variance on various performance metrics and method parameterisations. Our case-based analysis provides empirical evidence that data preprocessing has a significant impact on predictive accuracy, with certain schemes proving inferior to competitive approaches. In addition, it is found that (1) selected methods prove almost as sensitive to different data representations as to method parameterisations, indicating the potential for increased performance through effective preprocessing; (2) the impact of preprocessing schemes varies by method, indicating different ‘best practice’ setups to facilitate superior results of a particular method; (3) algorithmic sensitivity towards preprocessing is consequently an important criterion in method evaluation and selection which needs to be considered together with traditional metrics of predictive power and computational efficiency in predictive data mining.
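
The four preprocessing families named in the abstract (attribute scaling, sampling, coding of categorical attributes and coding of continuous attributes) can be illustrated with a minimal Python sketch. This is not the authors' experimental setup: the function names, the scaling intervals, the equal-width binning rule and the undersampling strategy are all assumptions chosen for illustration, and only the standard library is used.

```python
import random
from statistics import mean, pstdev


def min_max_scale(values, low=0.0, high=1.0):
    """Linearly rescale numeric values into [low, high] (bounds are illustrative)."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        return [low for _ in values]
    span = v_max - v_min
    return [low + (v - v_min) * (high - low) / span for v in values]


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(values):
    """Code a categorical attribute as one binary column per category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def equal_width_bins(values, n_bins):
    """Discretise a continuous attribute into n_bins equal-width intervals."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        return [0 for _ in values]
    width = (v_max - v_min) / n_bins
    # The maximum lies on the upper edge, so clamp it into the last bin.
    return [min(int((v - v_min) / width), n_bins - 1) for v in values]


def undersample(rows, labels, seed=0):
    """Randomly drop rows of larger classes until all classes are equally frequent."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        balanced.extend((row, label) for row in rng.sample(group, n))
    rng.shuffle(balanced)
    return balanced
```

Each function corresponds to one preprocessing choice that could be varied as a factor while the classifier and its parameters are held fixed, which is the kind of comparison the abstract describes. For example, `min_max_scale(ages, -1.0, 1.0)` and `min_max_scale(ages)` would yield two different representations of the same hypothetical `ages` attribute.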