Ant colony optimization based dimension reduction method for high-dimensional datasets
|Paper code||Year of publication||English paper||Persian translation||Word count|
|7867||2013||11-page PDF||Available to order||6483 words|
Publisher: Elsevier - Science Direct
Journal : Journal of Bionic Engineering, Volume 10, Issue 2, April 2013, Pages 231–241
In this paper, a dimension reduction method based on a bionic optimization algorithm, named Ant Colony Optimization-Selection (ACO-S), is proposed for high-dimensional datasets. Because microarray datasets comprise tens of thousands of features (genes), they are often used to test dimension reduction techniques. ACO-S consists of two stages, in which two well-known ACO algorithms, namely ant system and ant colony system, are used in turn to search for genes. In the first stage, a modified ant system filters the nonsignificant genes out of the high-dimensional space, and a number of promising genes are retained for the next stage. In the second stage, an improved ant colony system is applied to gene selection. To enhance the search ability of the ACOs, we propose a method for calculating a priori available heuristic information and design a fuzzy logic controller that dynamically adjusts the number of ants in the ant colony system. Furthermore, we devise another fuzzy logic controller to tune the parameter q0 in the ant colony system. We evaluate the performance of ACO-S on five microarray datasets whose dimensions range from 7129 to 12000. We also compare the performance of ACO-S with the results obtained from four existing well-known bionic optimization algorithms. The comparison shows that ACO-S has a notable ability to generate a gene subset with the smallest size and salient features while yielding high classification accuracy. Comparative results for ACO-S with different classifiers are also given. The proposed method is shown to be a promising and effective tool for mining high-dimensional data and for mobile robot navigation.
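The abstract does not give the selection formulas, so the following is only a minimal sketch of the role q0 plays in a standard ant colony system (the pseudo-random proportional rule from the general ACO literature), not the modified version proposed in the paper. The function name `choose_gene`, the dictionaries `pheromone` and `heuristic`, and the exponent `beta` are hypothetical names introduced for illustration; the sketch also assumes all pheromone and heuristic values are positive.

```python
import random


def choose_gene(pheromone, heuristic, candidates, q0, beta=2.0, rng=random):
    """Pick the next gene with the standard ACS pseudo-random proportional rule.

    pheromone, heuristic: dicts mapping gene id -> positive float (assumed inputs).
    q0: probability of greedy exploitation, in [0, 1].
    """
    # Desirability of each candidate gene: pheromone * heuristic ** beta.
    scores = {g: pheromone[g] * (heuristic[g] ** beta) for g in candidates}

    if rng.random() < q0:
        # Exploitation: take the single most desirable gene.
        return max(scores, key=scores.get)

    # Exploration: sample a gene with probability proportional to its score.
    genes = list(scores)
    weights = [scores[g] for g in genes]
    return rng.choices(genes, weights=weights, k=1)[0]


if __name__ == "__main__":
    pheromone = {"g1": 1.0, "g2": 1.0, "g3": 1.0}
    heuristic = {"g1": 0.2, "g2": 0.9, "g3": 0.5}
    # A high q0 favors the best-scoring gene; a low q0 spreads the choices.
    print(choose_gene(pheromone, heuristic, ["g1", "g2", "g3"], q0=0.9))
```

A larger q0 makes the ants exploit the currently best-looking genes, while a smaller q0 pushes them to explore, which is why adjusting q0 during the run changes the balance between the two behaviors.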
The advent of DNA microarray technology has provided not only the ability to measure the expression levels of thousands of genes simultaneously in a single experiment but also the possibility of diagnosing disease. Therefore, an overall understanding of the cell can be obtained. Gene expression data differ markedly from other kinds of data. First, they have very high dimensionality, usually containing thousands to tens of thousands of genes. Second, the amount of publicly available data is very small. Third, most genes are irrelevant to cancer distinction. As a result, existing classification methods turn out to be neither efficient nor effective at handling this kind of data [2–3]. The irrelevant gene expression data lead to high computational complexity and make it impossible to discover relevant genes. The reason for performing gene selection prior to cancer classification is twofold. First, gene selection helps reduce the data size and thus cuts down the running time. Second, and more importantly, gene selection can eliminate a great number of irrelevant genes and so improve the classification accuracy [4–5]. With the proliferation of high-dimensional data, Feature Selection (FS) has become an indispensable part of the learning process. FS aims to select a good subset of features from the original feature set, which contains abundant noise, spurious information, and irrelevant and redundant features, while still representing the original features with suitably high accuracy.
English conclusion
In this paper, we presented an efficient method for gene selection. The proposed framework consists of two stages, in which two ACOs, namely ant system and ant colony system, are used in turn to select genes. To further improve the search capability of the ACOs, we applied fuzzy logic control theory to adjust their parameters. In addition, we employed a new evaluation method to calculate a priori available heuristic information in the ant system. The experimental results show that ACO-S is able to balance exploration and exploitation, and thus finds more important genes by taking advantage of the parameter adjustment and gene importance. As a result, our proposed method can not only select a feature subset of the smallest size but also achieve the best classification accuracy.