الگوریتم های بهبود برای یادگیری ساختار در شبکه های بیزی با استفاده از نمره ضمنی جدید
|کد مقاله||سال انتشار||مقاله انگلیسی||ترجمه فارسی||تعداد کلمات|
|29022||2010||6 صفحه PDF||سفارش دهید||3587 کلمه|
Publisher : Elsevier - Science Direct (الزویر - ساینس دایرکت)
Journal : http://www.sciencedirect.com/science/journal/09574174, Volume 37, Issue 7, July 2010, Pages 5470–5475
Learning Bayesian Network structure from database is an NP-hard problem and still one of the most exciting challenges in machine learning. Most of the widely used heuristics search for the (locally) optimal graphs by defining a score metric and employs a search strategy to identify the network structure having the maximum score. In this work, we propose a new score (named implicit score) based on the Implicit inference framework that we proposed earlier. We then implemented this score within the K2 and MWST algorithms for network structure learning. Performance of the new score metric was evaluated on a benchmark database (ASIA Network) and a biomedical database of breast cancer in comparison with traditional score metrics BIC and BD Mutual Information. We show that implicit score yields improved performance over other scores when used with the MWST algorithm and have similar performance when implemented within K2 algorithm.
Bayesian Network (BN) is one of the most popular models in data mining technologies. A BN is a Directed Acyclic Graph (DAG), where the nodes are random variables and where the arcs specify the independence assumptions between these variables. During the past few years, several algorithms have been developed for learning the structure of BN from a database (Neapolitan, 2003). An important class of such algorithms is known as the score metric-based methods. These methods first define quantity metric that measures the quality of the network structures or the conditional dependence among variables given the database; second, they apply a search procedure to identify a structure that maximizes the metric over the structure space. Many scores metrics have been proposed and employed with a heuristic search strategy to identify the network with the maximum score. The most used algorithm, named the K2 algorithm, was proposed by cooper and Hersovits (1992) with a Bayesian Dirichlet (BD) score metric. Other authors used this algorithm with other score metrics such as the likelihood-equivalence Bayesian Dirichlet metric (BDe) (Heckerman, Geiger, & Chickering, 1994) and the minimum description length (MDL) metric (Bouckaert, 1993, Lam and Bacchus, 1993 and Rissanen, 1978) showing an improvement in the performance. Recently, Yun and Keong (2004) proposed another formulation of the MDL score, which they called IMDL (Improved MDL) .The BIC score metric proposed by Akaike, 1973 and Akaike, 1979 has been also used and implemented in several computer packages. Recently, Chen (2008) developed a new structure-learning algorithm based on scoring search methods and information theory. These score metrics have been widely compared and used. Shulin and Chang (2002) compared the performance of five score metrics using the K2 algorithm. Acid and de Campos (2001) introduced a new score and assessed its performance within the K2 algorithm in comparison with other scores. In this work, we propose a new score metric that we named implicit score (IS). This score is based on the Implicit inference framework proposed by Hassairi, Masmoudi, and Kokonendji (2005) and applied to probability learning in Bayesian Networks by BenHassen, Masmoudi, and Rebai (2008). We implemented this score within the K2 and MWST algorithms and compared its performance to the traditional score metrics (BD, BIC and Mutual Information) available in most implementation. This paper is organized as follows: In Section 2, we introduce BNs and relevant concepts. In Section 3, we describe the Implicit method and the new proposed score. Section 4 presents the experimental results. Finally, Section 5 presents our conclusions.
نتیجه گیری انگلیسی
The use of the implicit score with standard widely used algorithms for structure learning in Bayesian Networks seem to improve performance and particularly of the MWST algorithm particularly when used with a small database. This promising feature of the new score should be subject to more intensive testing in order to validate its potential and to recommend its use.