Download English ISI article No. 79009
Title

Integrated constraint based clustering algorithm for high dimensional data
Article code : 79009
Publication year : 2014
Length of English article : 8 pages (PDF)
Source

Publisher : Elsevier - Science Direct (الزویر - ساینس دایرکت)

Journal : Neurocomputing, Volume 142, 22 October 2014, Pages 478–485

Keywords
High dimensional data; Subspace clustering; Constraint based clustering

Abstract

Dimension selection, dimension weighting, and data assignment are three essential, circularly dependent tasks in high dimensional data clustering, and each is challenging in its own right. Several previous works have employed constraints to address this challenge. However, those constraint based algorithms use constraints to help accomplish only one of the three tasks. In this paper, we propose an integrated constraint based clustering (ICBC) algorithm for high dimensional data that exploits constraints to accomplish all three. First, we generalize the dimension selection technique of the CDCDD algorithm so that dimension selection and dimension weighting are accomplished simultaneously. Then we propose a novel constraint based data assignment method that assigns every data point to its corresponding cluster based on the selected dimensions and dimension weights. Finally, we use an optimization technique to iteratively refine the initial dimension weights and centroids, reassigning the data accordingly until convergence. Experimental results on both synthetic and real data sets show that ICBC is more accurate than typical unsupervised algorithms and other constraint based algorithms. ICBC is also more efficient and scalable than the other algorithms that implement dimension selection.
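
The abstract names the three interdependent tasks but gives no formulas, so the following is only a rough illustration of how such a loop can fit together; it is not the ICBC algorithm. It assumes a generic per-cluster weighted k-means in which must-link constraints are honored by assigning linked points as a group. The function names, the exponential weighting rule, and the parameter `h` are all hypothetical choices made for this sketch.

```python
import numpy as np


def weighted_sq_dist(X, centroids, weights):
    """Squared distance from each point to each centroid, using that
    cluster's own per-dimension weights. Returns an (n, k) array."""
    diff = X[:, None, :] - centroids[None, :, :]        # (n, k, d)
    return np.einsum("nkd,kd->nk", diff ** 2, weights)


def update_weights(X, labels, centroids, h=1.0):
    """Assumed weighting rule (not taken from the paper): dimensions with
    a small within-cluster spread receive exponentially larger weight."""
    k, d = centroids.shape
    weights = np.full((k, d), 1.0 / d)
    for j in range(k):
        members = X[labels == j]
        if len(members) == 0:
            continue                                     # keep uniform weights
        spread = ((members - centroids[j]) ** 2).mean(axis=0)
        w = np.exp(-(spread - spread.min()) / h)
        weights[j] = w / w.sum()                         # each row sums to 1
    return weights


def must_link_groups(n, must_link):
    """Merge must-linked points into groups with a small union-find."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in must_link:
        parent[find(a)] = find(b)
    return np.array([find(i) for i in range(n)])


def constrained_subspace_kmeans(X, k, must_link=(), n_iter=50, seed=0):
    """Alternate constrained assignment, centroid update, and weight update
    until the labels stop changing or n_iter is reached."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    centroids = X[rng.choice(n, size=k, replace=False)].astype(float)
    weights = np.full((k, d), 1.0 / d)
    groups = must_link_groups(n, must_link)
    labels = None
    for _ in range(n_iter):
        dist = weighted_sq_dist(X, centroids, weights)
        new_labels = np.empty(n, dtype=int)
        # Each must-link group goes to the cluster with the lowest total distance.
        for g in np.unique(groups):
            idx = np.flatnonzero(groups == g)
            new_labels[idx] = dist[idx].sum(axis=0).argmin()
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
        weights = update_weights(X, labels, centroids)
    return labels, centroids, weights
```

Calling `constrained_subspace_kmeans(X, k=2, must_link=[(0, 1)])` on a float array `X` of shape `(n, d)` returns the labels, centroids, and per-cluster dimension weights. The circular dependence mentioned in the abstract shows up in the loop body: the assignment needs weights and centroids, while the weights and centroids are recomputed from the assignment.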