Download English ISI article no. 46693
Article title

A multiple criteria active learning method for support vector regression
Article code : 46693
Publication year : 2014
Length of English article : 10 pages (PDF)
Source

Publisher : Elsevier - Science Direct

Journal : Pattern Recognition, Volume 47, Issue 7, July 2014, Pages 2558–2567

Keywords
Regression; Parameters estimation; Active learning; Support vector regression

Abstract

This paper presents a novel active learning method developed in the framework of ε-insensitive support vector regression (SVR) for solving regression problems in which only a small initial training set is available. The proposed method iteratively selects the most informative and representative unlabeled samples to be included in the training set by jointly evaluating three criteria: (i) relevancy, (ii) diversity, and (iii) density of samples. All three criteria are implemented according to the SVR properties and are applied in two consecutive clustering-based steps. In the first step, a novel measure is defined to select the most relevant samples, i.e., those that have a high probability of being located either outside or on the boundary of the ε-tube of the SVR. To this end, a clustering method is first applied to all unlabeled samples together with the training samples that lie inside the ε-tube (those that are not support vectors, i.e., non-SVs); the clusters that contain non-SVs are then eliminated. The unlabeled samples in the remaining clusters are considered the most relevant patterns. In the second step, a novel measure is defined to select diverse samples among the relevant patterns from the high-density regions of the feature space, in order to better model the SVR learning function. To this end, the clusters with the highest density of samples are first chosen to identify the highest-density regions of the feature space. Then, from each selected cluster, the sample associated with the portion of the feature space having the highest density (i.e., the sample most representative of the underlying distribution of the samples contained in that cluster) is selected to be included in the training set. In this way, diverse samples taken from high-density regions are efficiently identified. Experimental results obtained on four different data sets show the robustness of the proposed technique, particularly when only a small initial training set is available.
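
The two selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: a plain k-means in the input space with a deterministic farthest-first initialization stands in for the clustering method (which the abstract does not name), cluster size is used as the density proxy, and the "most representative" sample of a cluster is taken to be the member with the smallest mean distance to the other members. The names kmeans, select_batch, k_relevancy, k_density and batch_size are hypothetical. The non-SV training samples are assumed to be supplied by an already fitted SVR (with scikit-learn, for example, these would be the training rows whose indices are not listed in the model's support_ attribute).

```python
import numpy as np


def kmeans(X, k, n_iter=50):
    """Plain Lloyd's k-means with farthest-first initialization.

    Stand-in for the clustering method, which the abstract leaves unspecified.
    Returns one integer cluster label per row of X.
    """
    X = np.asarray(X, dtype=float)
    # Deterministic init: start from the first sample, then repeatedly add
    # the sample that is farthest from all centers chosen so far.
    centers = [X[0]]
    for _ in range(1, k):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def select_batch(X_unlabeled, X_non_sv, k_relevancy, k_density, batch_size):
    """Return indices (into X_unlabeled) of the samples to label next."""
    X_unlabeled = np.asarray(X_unlabeled, dtype=float)
    X_non_sv = np.asarray(X_non_sv, dtype=float)
    n_u = len(X_unlabeled)

    # Step 1 (relevancy): cluster unlabeled samples together with the non-SVs,
    # then discard every cluster that contains at least one non-SV.
    labels = kmeans(np.vstack([X_unlabeled, X_non_sv]), k_relevancy)
    eliminated = np.unique(labels[n_u:])
    relevant = np.flatnonzero(~np.isin(labels[:n_u], eliminated))
    if len(relevant) == 0:
        return relevant

    # Step 2 (diversity and density): cluster the relevant samples again.
    k = min(k_density, len(relevant))
    labels2 = kmeans(X_unlabeled[relevant], k)
    clusters = [relevant[labels2 == j] for j in range(k)]
    clusters = [c for c in clusters if len(c) > 0]
    # Assumed density proxy: cluster size, largest clusters first.
    clusters.sort(key=len, reverse=True)

    chosen = []
    for members in clusters[:batch_size]:
        pts = X_unlabeled[members]
        # Assumed representative: the member with the smallest mean distance
        # to the members of its own cluster (one sample per cluster).
        d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
        chosen.append(int(members[d.mean(axis=1).argmin()]))
    return np.array(chosen, dtype=int)


# Toy usage: unlabeled samples 0 and 1 share a cluster with the non-SVs and
# are eliminated in step 1; samples 2-4 form the largest remaining cluster.
X_unlabeled = [[0, 0], [0.2, 0], [10, 0], [10.2, 0], [10.1, 0], [0, 10]]
X_non_sv = [[0.1, 0], [0.1, 0.1]]
print(select_batch(X_unlabeled, X_non_sv, 3, 2, 2))
```

In an active learning loop, the returned samples would be labeled, added to the training set, and the SVR retrained before the next call. Taking one sample per cluster is what provides diversity in this sketch, while ranking the clusters by size favors high-density regions.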