Download English-language ISI article No. 151342
English title
A Genetic Programming approach for feature selection in highly dimensional skewed data
Article code: 151342
Publication year: 2018
Length of the English article: 34 pages (PDF)
Source

Publisher : Elsevier - ScienceDirect

Journal : Neurocomputing, Volume 273, 17 January 2018, Pages 554-569

Keywords
Feature selection; Classification; Genetic Programming

Abstract

High dimensionality, also known as the curse of dimensionality, is still a major challenge for automatic classification solutions. Accordingly, several feature selection (FS) strategies have been proposed for dimensionality reduction over the years. However, they can perform poorly in the face of unbalanced data. In this work, we propose a novel feature selection strategy based on Genetic Programming that is resilient to data skewness; in other words, it works well with both balanced and unbalanced data. The proposed strategy combines the most discriminative feature sets selected by distinct feature selection metrics in order to obtain a more effective and impartial set of the most discriminative features. It starts from the hypothesis that distinct feature selection metrics produce different (and potentially complementary) feature space projections. We evaluated our proposal on biological and textual datasets. Our experimental results show that the proposed solution not only increases the efficiency of the learning process, reducing the size of the data space by up to 83%, but also significantly increases its effectiveness in some scenarios.
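
The idea of combining feature sets chosen by different selection metrics can be illustrated with a minimal sketch. The abstract does not say how candidate solutions are represented, so everything here is an assumption: a candidate is modeled as a small expression tree whose leaves are the feature sets picked by individual metrics and whose inner nodes are set operations. The metric names, the feature indices, and the helpers `evaluate` and `reduction` are hypothetical, and the evolutionary search and fitness evaluation of Genetic Programming are left out.

```python
from typing import FrozenSet, Tuple, Union

# A tree is either a metric name (leaf) or a tuple (operator, left, right).
Tree = Union[str, Tuple[str, "Tree", "Tree"]]

# Hypothetical top-ranked feature indices per metric (made-up values).
FEATURE_SETS = {
    "info_gain": frozenset({0, 1, 2, 5}),
    "chi2": frozenset({1, 2, 3, 7}),
    "gini": frozenset({2, 5, 7, 9}),
}

# Assumed operator set for inner nodes; the abstract does not list one.
OPERATORS = {
    "union": frozenset.union,
    "intersection": frozenset.intersection,
    "difference": frozenset.difference,
}


def evaluate(tree: Tree) -> FrozenSet[int]:
    """Recursively reduce a candidate tree to a single feature set."""
    if isinstance(tree, str):
        return FEATURE_SETS[tree]
    op, left, right = tree
    return OPERATORS[op](evaluate(left), evaluate(right))


def reduction(selected: FrozenSet[int], total_features: int) -> float:
    """Fraction of the original feature space that was discarded."""
    return 1.0 - len(selected) / total_features


# One candidate: (info_gain AND chi2) OR (gini MINUS chi2).
candidate: Tree = (
    "union",
    ("intersection", "info_gain", "chi2"),
    ("difference", "gini", "chi2"),
)

selected = evaluate(candidate)
print(sorted(selected), f"{reduction(selected, 10):.0%}")  # [1, 2, 5, 9] 60%
```

In a full Genetic Programming setup, many such trees would be generated, recombined, and mutated, and each would be scored by training a classifier on the features it selects. A fitness measure that is less sensitive to class imbalance, such as macro-averaged F1, would be one plausible choice, though the abstract does not name one.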