Download English ISI Article No. 79059
Article Title

An agglomerative clustering algorithm using a dynamic k-nearest-neighbor list
Article Code: 79059
Publication Year: 2011
Pages: 13 (PDF)
Source

Publisher: Elsevier - Science Direct

Journal: Information Sciences, Volume 181, Issue 9, 1 May 2011, Pages 1722–1734

Keywords
Nearest neighbor; Agglomerative clustering; Vector quantization

Abstract

In this paper, a new algorithm is developed to reduce the computational complexity of Ward’s method. The proposed approach uses a dynamic k-nearest-neighbor list to avoid determining a cluster’s nearest neighbor at some steps of the merging process. The double linked algorithm (DLA) can significantly reduce the computing time of the fast pairwise nearest neighbor (FPNN) algorithm by obtaining an approximate solution of hierarchical agglomerative clustering. We propose a method that resolves DLA’s non-optimal solution while retaining its advantage of low computational complexity. The computational complexity of the proposed method, DKNNA + FS (dynamic k-nearest-neighbor algorithm with a fast search), is O(N²) in the number of distance calculations, where N is the number of data points. Compared to FPNN with a fast search (FPNN + FS), the proposed method using the same fast search algorithm (DKNNA + FS) reduces the computing time by a factor of 1.90–2.18 on a data set taken from a real image, and by a factor of 1.92–2.02 on a data set generated from three images. Compared to DLA with a fast search (DLA + FS), DKNNA + FS decreases the average mean square error by 1.26% on the same data set.
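To make the idea of keeping a per-cluster nearest-neighbor candidate list concrete, the minimal Python sketch below performs Ward-linkage agglomerative clustering while caching, for each cluster, its k cheapest merge partners. This is an illustration only, not the paper's DKNNA + FS: the function names, the rebuild-on-demand update policy, and the omission of the fast-search component are all assumptions made for clarity.

```python
# Illustrative sketch only: Ward-linkage agglomerative clustering with a cached
# k-nearest-neighbor candidate list per cluster. All names are hypothetical and
# this is NOT the DKNNA + FS algorithm described in the paper.
import numpy as np

def ward_cost(na, ca, nb, cb):
    """Ward merge cost between two clusters: (na*nb)/(na+nb) * ||ca - cb||^2."""
    diff = ca - cb
    return (na * nb) / (na + nb) * float(diff @ diff)

def agglomerative_knn(points, target_clusters, k=5):
    """Merge single-point clusters until `target_clusters` remain, always
    taking the cheapest merge found in the cached k-NN candidate lists."""
    points = np.asarray(points, dtype=float)
    # Each cluster is stored as (size, centroid).
    clusters = {i: (1, points[i].copy()) for i in range(len(points))}

    def knn_list(cid):
        # Recompute the k cheapest merge partners of cluster `cid`.
        na, ca = clusters[cid]
        costs = sorted((ward_cost(na, ca, nb, cb), other)
                       for other, (nb, cb) in clusters.items() if other != cid)
        return costs[:k]

    neighbors = {cid: knn_list(cid) for cid in clusters}

    while len(clusters) > target_clusters:
        # Cheapest merge among all cached candidate lists (an approximation:
        # lists of untouched clusters are not refreshed after every merge).
        a, (cost, b) = min(((cid, lst[0]) for cid, lst in neighbors.items()),
                           key=lambda item: item[1][0])
        (na, ca), (nb, cb) = clusters[a], clusters[b]
        clusters[a] = (na + nb, (na * ca + nb * cb) / (na + nb))
        del clusters[b], neighbors[b]
        # Rebuild only the lists that mention a merged cluster; a truly dynamic
        # update, as the paper pursues, would patch lists instead of rebuilding.
        for cid in list(neighbors):
            if cid == a or any(o in (a, b) for _, o in neighbors[cid]):
                neighbors[cid] = knn_list(cid)
    return clusters

# Example usage: reduce 200 random 2-D points to 8 clusters.
# clusters = agglomerative_knn(np.random.rand(200, 2), target_clusters=8, k=5)
```

In the paper's DKNNA + FS, the candidate lists are maintained dynamically across merges and combined with a fast search so that a cluster's nearest neighbor need not be redetermined at every merge step; the rebuild-on-demand loop above is only a simplified stand-in for that mechanism.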