Text stream clustering algorithm based on adaptive feature selection
|Paper code||Publication year||English paper||Persian translation||Word count|
|79078||2011||7-page PDF||Available to order||3807 words|
Publisher: Elsevier - ScienceDirect
Journal : Expert Systems with Applications, Volume 38, Issue 3, March 2011, Pages 1393–1399
Text stream analysis is now of great importance and practical value. Its applications include newsgroup filtering, topic detection and tracking (TDT), and user-characterized recommendation. Clustering is one of the most important methods for analyzing text streams. However, most text stream clustering algorithms rarely consider that the features may change over a long clustering run, which is usually the case, and this leads to unsatisfactory clustering results. This paper focuses on the problem of adaptive feature selection for text stream clustering. A method of adaptive feature selection based on a validity index is proposed, and a new text stream clustering algorithm incorporating it is developed. During clustering, a threshold on the cluster validity index automatically triggers feature re-selection in order to preserve the validity of the clustering. An experiment using the Reuters-21578 collection as the text source shows that the algorithm produces high-quality clustering results.
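The abstract gives only the outline of the mechanism, so the following is a minimal Python sketch of the general idea rather than the paper's algorithm. Every concrete choice is an assumption made for illustration: document frequency as the feature criterion, one-pass leader clustering, the mean best-centroid cosine similarity as the validity index, the sliding window, and all function names and threshold values.

```python
from collections import Counter
import math


def select_features(docs, k):
    """Pick the k terms with the highest document frequency (assumed criterion)."""
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    # Descending frequency, then alphabetical order so the result is deterministic.
    return sorted(df, key=lambda t: (-df[t], t))[:k]


def vectorize(doc, features):
    """Term-count vector of a tokenized document over the current feature set."""
    counts = Counter(doc)
    return [float(counts[t]) for t in features]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def assign(vec, centroids, sim_threshold):
    """Add vec to the most similar centroid, or open a new cluster for it."""
    best, best_sim = -1, sim_threshold
    for i, c in enumerate(centroids):
        s = cosine(vec, c)
        if s >= best_sim:
            best, best_sim = i, s
    if best < 0:
        centroids.append(list(vec))
        return len(centroids) - 1
    # Centroids are stored as sum vectors; cosine similarity ignores their scale.
    centroids[best] = [a + b for a, b in zip(centroids[best], vec)]
    return best


def leader_cluster(vectors, sim_threshold):
    """One-pass clustering used to rebuild centroids after feature re-selection."""
    centroids = []
    for v in vectors:
        assign(v, centroids, sim_threshold)
    return centroids


def validity_index(vectors, centroids):
    """Stand-in validity index: mean similarity of each document to its best centroid."""
    if not vectors or not centroids:
        return 0.0
    return sum(max(cosine(v, c) for c in centroids) for v in vectors) / len(vectors)


def cluster_stream(stream, n_features=3, sim_threshold=0.3,
                   validity_threshold=0.6, window_size=4):
    """Cluster a stream of token lists; re-select features when validity drops."""
    window, features, centroids, reselections = [], [], [], 0
    for doc in stream:
        window = (window + [doc])[-window_size:]
        if features:
            assign(vectorize(doc, features), centroids, sim_threshold)
            vectors = [vectorize(d, features) for d in window]
            if validity_index(vectors, centroids) >= validity_threshold:
                continue
            reselections += 1  # validity fell below the threshold
        # Initial selection, or re-selection triggered by the threshold.
        features = select_features(window, n_features)
        vectors = [vectorize(d, features) for d in window]
        centroids = leader_cluster(vectors, sim_threshold)
    return features, reselections


# Toy stream whose vocabulary shifts from one topic to another.
stream = [["oil", "price", "market"]] * 4 + [["football", "match", "goal"]] * 6
features, reselections = cluster_stream(stream)
```

On this toy stream the documents of the second topic share no terms with the initial feature set, so their vectors are empty and the validity index falls until the threshold triggers one re-selection. A stream that stays on one topic triggers none.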