An ACS-based framework for fuzzy data mining
Publisher: Elsevier - ScienceDirect
Journal : Expert Systems with Applications, Volume 36, Issue 9, November 2009, Pages 11844–11852
Data mining is often used to find interesting and meaningful patterns in huge databases. It may generate different kinds of knowledge, such as classification rules, clusters, and association rules, among others. Much research on data mining has been proposed, most of it focused on mining from binary-valued data. Fuzzy data mining was thus proposed to discover fuzzy knowledge from linguistic or quantitative data. Recently, ant colony systems (ACS) have been successfully applied to optimization problems. However, little work has been done on applying ACS to fuzzy data mining. This work thus attempts to propose an ACS-based framework for fuzzy data mining. In the framework, the membership functions are first encoded into binary bits and then fed into the ACS to search for the optimal set of membership functions. The problem is thereby transformed into a multi-stage graph, with each route representing a possible set of membership functions. When the termination condition is reached, the best membership function set (the one with the highest fitness value) can be used to mine fuzzy association rules from a database. Finally, experiments are carried out to compare the proposed framework with other approaches and to show its performance.
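The search described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: it assumes each stage of the multi-stage graph has two nodes (bit 0 and bit 1), uses the standard ACS pseudo-random proportional rule with local and global pheromone updates, and leaves out any heuristic term. The function name `acs_bit_search`, all parameter values, and the toy fitness `count_ones` are hypothetical; the paper's actual encoding of membership functions and its fitness function are not given in this excerpt.

```python
import random


def acs_bit_search(n_bits, fitness, n_ants=10, n_iters=50, q0=0.9,
                   rho=0.1, alpha=0.1, tau0=0.5, seed=0):
    """Search for a high-fitness bit string with an ACS-style loop.

    Stage i of the graph has two nodes (bit 0 and bit 1). A route picks one
    node per stage, so each route is one candidate bit string.
    Assumption: fitness values are non-negative, so pheromone stays positive.
    """
    rng = random.Random(seed)
    tau = [[tau0, tau0] for _ in range(n_bits)]  # pheromone per (stage, bit)
    best_bits, best_fit = None, float("-inf")

    for _ in range(n_iters):
        for _ in range(n_ants):
            bits = []
            for i in range(n_bits):
                if rng.random() < q0:
                    # Exploitation: take the bit with more pheromone.
                    b = 0 if tau[i][0] >= tau[i][1] else 1
                else:
                    # Exploration: sample in proportion to pheromone.
                    p1 = tau[i][1] / (tau[i][0] + tau[i][1])
                    b = 1 if rng.random() < p1 else 0
                bits.append(b)
                # Local update: pull the chosen node back toward tau0.
                tau[i][b] = (1 - rho) * tau[i][b] + rho * tau0
            fit = fitness(bits)
            if fit > best_fit:
                best_bits, best_fit = bits, fit
        # Global update: only the best route found so far is reinforced.
        # Depositing the raw fitness value is an assumption of this sketch.
        for i, b in enumerate(best_bits):
            tau[i][b] = (1 - alpha) * tau[i][b] + alpha * best_fit
    return best_bits, best_fit


def count_ones(bits):
    """Toy stand-in for a fitness function: the number of 1 bits."""
    return sum(bits)


best_bits, best_fit = acs_bit_search(8, count_ones)
print(best_bits, best_fit)
```

In a real use of the framework, `count_ones` would be replaced by a function that decodes the bit string into a set of membership functions and scores that set.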
Data mining is most commonly used to induce association rules from transaction data. An association rule is an expression X → Y, where X is a set of items and Y is a single item (Agrawal & Srikant, 1994). It means that, within a set of transactions, if all the items in X appear in a transaction, then Y also appears in that transaction with high probability. For example, assume that whenever customers in a supermarket buy bread and butter, they also buy milk. From the transactions kept by the supermarket, an association rule such as “Bread and Butter → Milk” will be mined. Most previous studies focused on binary-valued transaction data. Transaction data in real-world applications, however, usually consist of quantitative values. Designing a data mining algorithm that can deal with various types of data is therefore a challenge for researchers in this field.
Recently, fuzzy set theory has been used more and more frequently in intelligent systems because of its simplicity and its similarity to human reasoning (Kandel, 1992). The theory has been applied in fields such as manufacturing, engineering, diagnosis, and economics. Several fuzzy learning algorithms for inducing rules from given sets of data have been designed and used to good effect in specific domains. In fuzzy data mining, Hong, Kuo, and Wang (2004) proposed an approach that integrated fuzzy set concepts with the apriori mining algorithm to find interesting fuzzy itemsets and association rules in quantitative transaction data. In that approach, the membership functions used for fuzzy data mining have to be defined in advance. Hong, Chen, Wu, and Lee (2006) therefore proposed a GA-based fuzzy data mining method that extracts both association rules and membership functions from quantitative transactions. The GA-based method is divided into two phases: mining membership functions and mining fuzzy association rules.
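To make the role of predefined membership functions concrete, the following sketch fuzzifies the purchased quantities of a single item. It assumes triangular membership functions and sums the membership values over all transactions (a scalar-cardinality count, one common choice in fuzzy mining). The region names, the breakpoints in `REGIONS`, and the sample quantities are hypothetical and are not taken from the paper.

```python
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a, c and peak b.

    Assumes a < b < c.
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy regions for the quantity of one item.
REGIONS = {"low": (0, 3, 6), "middle": (3, 6, 9), "high": (6, 9, 12)}


def fuzzify(quantity, regions=REGIONS):
    """Map one quantity to its membership value in every region."""
    return {name: triangular(quantity, *abc) for name, abc in regions.items()}


def fuzzy_counts(quantities, regions=REGIONS):
    """Sum the membership values of each region over all transactions."""
    counts = {name: 0.0 for name in regions}
    for q in quantities:
        for name, mu in fuzzify(q, regions).items():
            counts[name] += mu
    return counts


# Hypothetical quantities of milk bought in four transactions.
milk_quantities = [2, 5, 8, 3]
counts = fuzzy_counts(milk_quantities)
print(counts)
```

Changing the breakpoints in `REGIONS` changes every count, which is why the choice of membership functions matters for the rules that are eventually mined.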
In the phase of mining membership functions, a GA is used to derive membership functions suitable for the mining problem. In the phase of mining fuzzy association rules, the best membership functions derived by the GA are used to fuzzify the quantitative transactions. A fuzzy mining approach proposed by Hong, Kuo, and Chi (1999) can then be used to find fuzzy association rules.
Recently, ant colony systems (ACS) have been successfully applied to optimization problems. They are a heuristic approach inspired by the behavior of social insects. Ants deposit chemical trails called “pheromone” on the ground to communicate with one another. By following the pheromone, ants can find the shortest path between a source and a destination. The characteristics of an ant colony include positive feedback and distributed computation. It also uses a constructive greedy heuristic (Kuo, Chiu, & Lin, 2004) to search for solutions.
Research on data mining based on the ant colony system is still rare. Earlier work on ACS-based rule discovery was proposed by Parpinelli et al. (2001) and Cordon and Herrera (2002), who mined classification rules for fuzzy control systems. Very little other research explores association rules. Therefore, in this work, we propose an ACS-based framework to extract membership functions from quantitative data for fuzzy data mining. Numerical experiments on the proposed algorithm are also performed to show its effectiveness.
The remainder of the paper is organized as follows. Section 2 reviews ACS and fuzzy data mining. An ACS-based mining framework is then presented in Section 3. The details of how to use ACS for fuzzy data mining are explained in Section 4. The proposed algorithm based on this framework is described in Section 5. An example demonstrating the proposed algorithm is given in Section 6. Numerical simulations are shown in Section 7. Conclusions and future work are given in Section 8.
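The association rule X → Y defined at the start of this introduction can be checked on binary-valued transactions with a few lines of code. The sketch below measures the "high probability" as the usual confidence, support(X ∪ {Y}) / support(X). The sample transactions and the helper names `support` and `confidence` are hypothetical illustrations.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent in t | antecedent is a subset of t)."""
    ante = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | {consequent})
    return both / ante if ante > 0 else 0.0


# Hypothetical supermarket transactions, each a set of purchased items.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter", "jam"},
    {"bread", "butter", "milk", "jam"},
]

conf = confidence(transactions, {"bread", "butter"}, "milk")
print(conf)
```

Here every transaction containing both bread and butter also contains milk, so the rule “Bread and Butter → Milk” holds in all matching transactions of this toy data set.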
Conclusion
In this paper, we have examined the issues involved in applying the ACS algorithm to extract membership functions for fuzzy data mining and have proposed an algorithm for that purpose. An example is given to demonstrate the proposed algorithm, and numerical experiments are carried out to show its performance. Experimental results show that it extracts a larger amount of knowledge than both the GA-based approach and uniform partitioning. However, more work needs to be done in the future. For example, the design of other heuristic functions for the state transition and the definition of different fitness values may be studied further.