Association rules mining through the integration of clustering analysis and ant colony system for health insurance database in Taiwan
|Article code||Publication year||English article||Word count|
|24410||2007||15-page PDF||8837 words|
Publisher: Elsevier - Science Direct
Journal : Expert Systems with Applications, Volume 33, Issue 3, October 2007, Pages 794–808
Beyond sharing and applying knowledge within a community, knowledge discovery has become an important issue in the era of the knowledge economy, and data mining plays an important role in it. This study therefore proposes a novel data mining framework that first clusters the data and then mines association rules. The first stage employs the ant system-based clustering algorithm (ASCA) and ant K-means (AK) to cluster the database; an ant colony system-based association rules mining algorithm is then applied to discover useful rules within each group. The medical database provided by the National Health Insurance Bureau of the Taiwan Government is used to verify the proposed method. The evaluation results show that the proposed method not only extracts rules much faster but also discovers more important rules.
In recent years, human life has changed dramatically, especially through information technology, which has become an essential part of daily life. Its convenience makes it easy to store all kinds of information on science, medicine, finance, population statistics, marketing and so on. Without a useful method for applying these data, however, they are garbage rather than resources. Driven by this demand, more and more researchers are studying how to use data effectively and efficiently, an effort known as data mining. Data mining is a fast-growing field that draws on many areas, including database techniques, artificial intelligence, machine learning, neural networks, statistical techniques, pattern recognition and data visualization. Its objective is to automatically find hidden knowledge or information in large databases that may help in making business or policy decisions. Data mining can be divided into topics such as classification, estimation, forecasting, clustering, association rules and sequential patterns (Peacock, 1998). Among these, this study proposes a framework that integrates clustering analysis and association rules mining to discover useful rules from a database by means of the ant colony optimization system. The proposed method therefore consists of two components: (1) clustering analysis and (2) association rules mining. The first stage employs the ant system-based clustering algorithm (ASCA) and ant K-means (AK) to cluster the database, while the ant colony system-based association rules mining algorithm is applied to discover useful rules for each group. The database is clustered first because doing so can dramatically decrease the mining time. To assess the proposed method, a database provided by the National Health Insurance Plan of the Taiwan Government is used.
This database has accumulated 12 million administrative and claims records and is the largest database in the world. The work is a collaboration between the National Health Research Institute and the National Health Insurance Bureau of the Taiwan Government to establish a National Health Insurance research database. The computational results show that the proposed method not only extracts useful rules faster but also provides more precise rules for medical doctors. The rest of this paper is organized as follows. Section 2 summarizes general background on data mining, clustering analysis, association rules and the ant colony optimization system, and Section 3 presents the proposed method. Section 4 illustrates the results of applying the proposed method to real-world data. Finally, concluding remarks are made in Section 5.
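The two-stage idea described above (cluster the records first, then mine association rules inside each cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: plain k-means stands in for ASCA and AK, and exhaustive pair counting stands in for the ant colony system-based rule miner. All function names, the toy diagnosis codes and the thresholds are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def sqdist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain k-means (a stand-in for ASCA/AK, NOT the ant-based algorithms)."""
    # Deterministic farthest-first initialisation keeps the sketch reproducible.
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sqdist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sqdist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def mine_pair_rules(records, min_support, min_confidence):
    """Exhaustively mine single-item rules x -> y (a stand-in for the ACS miner)."""
    n = len(records)
    item_counts = Counter(code for rec in records for code in rec)
    pair_counts = Counter(pair for rec in records
                          for pair in combinations(sorted(rec), 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


def cluster_then_mine(records, k, min_support, min_confidence):
    """Stage 1: cluster one-hot encoded records. Stage 2: mine rules per cluster."""
    vocabulary = sorted(set().union(*records))
    points = [[1.0 if code in rec else 0.0 for code in vocabulary]
              for rec in records]
    labels = kmeans(points, k)
    rules_by_cluster = {}
    for c in range(k):
        group = [rec for rec, lab in zip(records, labels) if lab == c]
        if group:
            rules_by_cluster[c] = mine_pair_rules(group, min_support, min_confidence)
    return rules_by_cluster


# Toy records: each set holds hypothetical diagnosis codes for one patient.
demo_records = [{"A", "B"}, {"A", "B"}, {"A", "B", "C"},
                {"X", "Y"}, {"X", "Y"}, {"X", "Y", "Z"}]
for cluster, rules in cluster_then_mine(demo_records, 2, 0.5, 0.8).items():
    print(cluster, rules)
```

In this toy data each rule holds in only half of all records, yet it reaches full support inside its own cluster. Each rule search also scans fewer records, which is in the spirit of the reduced mining time the text describes.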
Conclusion
In the early 21st century, the development of science and technology has allowed medicine to prosper and has greatly changed the environment. It is therefore quite difficult to predict what will happen in the future, especially as more and more previously unknown diseases, such as SARS and bird flu, have been found recently. As a result, human beings have to fight ever harder against germs. Developing a decision support system for patient treatment and extracting the important relationships, or association rules, between diseases has therefore become a critical issue. Such a system also offers an approach, distinct from medicine and biology, to help diagnose diseases and find treatments. Accordingly, this study has developed a method that can quickly discover more useful and accurate rules from a medical database. To avoid losing knowledge when dividing the data, we divide the medical database into several clusters with the ant colony system and then mine the hidden knowledge from the clustered data, also with the ant colony system. This not only lets researchers focus on important groups and find hidden relations within them more easily, but also prevents important relationships from being overlooked in the large database. The evaluation results using the National Health Insurance Database show the feasibility of the proposed method. Although the results of this study show a promising application, some issues remain to be solved. Because this study mines only the relations between ICD codes, it is suggested that numerical data from medical examinations be added and fuzzified in the preparation stage. In the clustering analysis stage, the proposed method uses ASCA and AK to build the clusters; it may therefore be desirable to apply other clustering methods, such as ART2, ADSOM or other two-stage methods.
In addition, because the mining process generates many similar rules, it is feasible to apply other techniques, such as fuzzy theory, to merge them.
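The suggestion above to fuzzify numerical examination data in the preparation stage could look roughly like the following sketch. It assumes triangular membership functions, which the source does not specify. The breakpoints on a 0 to 100 scale, the 0.5 cutoff and all names are hypothetical and are not clinical thresholds.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), rising to 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical linguistic terms on a 0-100 scale (illustrative values only).
TERMS = {
    "low": (-1.0, 0.0, 50.0),
    "medium": (0.0, 50.0, 100.0),
    "high": (50.0, 100.0, 101.0),
}


def fuzzify(value, terms=TERMS):
    """Map a numeric reading to a membership degree for every term."""
    return {label: triangular(value, *abc) for label, abc in terms.items()}


def to_items(name, value, cutoff=0.5, terms=TERMS):
    """Turn a reading into categorical items usable alongside diagnosis codes."""
    return {f"{name}={label}"
            for label, degree in fuzzify(value, terms).items()
            if degree >= cutoff}


print(fuzzify(25.0))
print(to_items("exam", 90.0))
```

Items produced this way could be added to each patient's set of ICD codes, so that the same rule-mining step covers diagnoses and examination results together.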