Download English ISI article No. 150612
Article title
Novel density-based and hierarchical density-based clustering algorithms for uncertain data
Article code: 150612
Publication year: 2017
Length of English article: 33 pages (PDF)
Source

Publisher : Elsevier - ScienceDirect

Journal : Neural Networks, Volume 93, September 2017, Pages 240-255

Keywords
Clustering; Uncertain data; Density-based algorithm; Hierarchical density-based algorithm

Abstract

Uncertain data has posed a great challenge to traditional clustering algorithms. Recently, several algorithms have been proposed for clustering uncertain data, and among them density-based techniques seem promising for handling data uncertainty. However, some issues like losing uncertain information, high time complexity and a nonadaptive threshold have not been addressed well in the previous density-based algorithm FDBSCAN and hierarchical density-based algorithm FOPTICS. In this paper, we first propose a novel density-based algorithm PDBSCAN, which improves the previous FDBSCAN in the following aspects: (1) it employs a more accurate method to compute the probability that the distance between two uncertain objects is less than or equal to a boundary value, instead of the sampling-based method in FDBSCAN; (2) it introduces new definitions of probability neighborhood, support degree, core object probability, and direct reachability probability, thus reducing the complexity and solving the issue of the nonadaptive threshold (for core object judgement) in FDBSCAN. Then, we modify the algorithm PDBSCAN to an improved version (PDBSCANi), by using a better cluster assignment strategy to ensure that every object will be assigned to the most appropriate cluster, thus solving the issue of the nonadaptive threshold (for direct density reachability judgement) in FDBSCAN. Furthermore, as PDBSCAN and PDBSCANi have difficulty clustering uncertain data with non-uniform cluster density, we propose a novel hierarchical density-based algorithm POPTICS by extending the definitions of PDBSCAN, adding new definitions of fuzzy core distance and fuzzy reachability distance, and employing a new clustering framework. POPTICS can reveal the cluster structures of datasets with different local densities in different regions better than PDBSCAN and PDBSCANi, and it addresses the issues in FOPTICS.
Experimental results demonstrate the superiority of our proposed algorithms over the existing algorithms in accuracy and efficiency.
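
The abstract contrasts PDBSCAN with a sampling-based way of computing the probability that the distance between two uncertain objects is at most a boundary value. As a rough illustration of that sampling idea only, the Python sketch below estimates this probability from equally weighted samples of each object. It is not the method of the paper or of FDBSCAN as published: the function names, the Gaussian uncertainty model and the parameter `eps` are assumptions made for this example.

```python
import math
import random
from itertools import product


def distance_probability(samples_a, samples_b, eps):
    """Estimate P(dist(A, B) <= eps) from equally weighted samples.

    Assumption: each uncertain object is represented by a finite list of
    sample points, and every pair of samples is equally likely.
    """
    pairs = list(product(samples_a, samples_b))
    hits = sum(1 for p, q in pairs if math.dist(p, q) <= eps)
    return hits / len(pairs)


def sample_gaussian_object(center, sigma, n, rng):
    """Draw n points around `center` (hypothetical isotropic Gaussian model)."""
    return [tuple(rng.gauss(c, sigma) for c in center) for _ in range(n)]


rng = random.Random(0)
a = sample_gaussian_object((0.0, 0.0), 0.5, 200, rng)
b = sample_gaussian_object((1.0, 0.0), 0.5, 200, rng)

# A larger boundary value can only increase the estimated probability.
p_near = distance_probability(a, b, eps=1.5)
p_far = distance_probability(a, b, eps=0.2)
```

With n samples per object, this estimate compares n × n pairs for every pair of objects, which gives a sense of why the abstract lists high time complexity among the drawbacks of sampling-based approaches.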