روش های جدید مبتنی بر اسکلت برای یادگیری ساختار بیزی از شبکه های بیزی
|کد مقاله||سال انتشار||مقاله انگلیسی||ترجمه فارسی||تعداد کلمات|
|29197||2013||11 صفحه PDF||سفارش دهید||8626 کلمه|
Publisher : Elsevier - Science Direct (الزویر - ساینس دایرکت)
Journal : Applied Soft Computing, Volume 13, Issue 2, February 2013, Pages 1110–1120
Automatically learning the graph structure of a single Bayesian network (BN) which accurately represents the underlying multivariate probability distribution of a collection of random variables is a challenging task. But obtaining a Bayesian solution to this problem based on computing the posterior probability of the presence of any edge or any directed path between two variables or any other structural feature is a much more involved problem, since it requires averaging over all the possible graph structures. For the former problem, recent advances have shown that search + score approaches find much more accurate structures if the search is constrained by a previously inferred skeleton (i.e. a relaxed structure with undirected edges which can be inferred using local search based methods). Based on similar ideas, we propose two novel skeleton-based approaches to approximate a Bayesian solution to the BN learning problem: a new stochastic search which tries to find directed acyclic graph (DAG) structures with a non-negligible score; and a new Markov chain Monte Carlo method over the DAG space. These two approaches are based on the same idea. In a first step, both employ a previously given skeleton and build a Bayesian solution constrained by this skeleton. In a second step, using the preliminary solution, they try to obtain a new Bayesian approximation but this time in an unconstrained graph space, which is the final outcome of the methods. As shown in the experimental evaluation, this new approach strongly boosts the performance of these two standard techniques proving that the idea of employing a skeleton to constrain the model space is also a successful strategy for performing Bayesian structure learning of BNs.
Bayesian networks (BN) are statistical models which allow to graphically represent, by means of a directed acyclic graph (DAG), complex structures of dependencies among stochastic variables . They have been widely employed in a great variety of real world problems because of their excellent properties for reasoning under uncertainty . The problem of automatic learning of the structure of a BN from a database has been the subject of a great deal of research ,  and . Traditionally, two procedures have been considered for this problem: one based on scoring and searching  and ; and other, constraint based (CB) learning, based on carrying out several independence tests on the learning data set to build a Bayesian network in agreement with tests results . However, in the past years several methods have been proposed which combine aspects of both basic procedures. For example  and , employ Bayesian scores to carry out the statistical tests in a PC-like algorithm. They showed that these scores reduce the average number of structural errors. Other works  and  introduce some PC variants which use a greedy procedure to introduce links, similar to the K2 algorithm , in order to make independent tests only relative to the added links. But the most successful strategy has been based on the induction of a BN skeleton (i.e. a graph with undirected edges) by means of CB approaches and, in a second step, run a score + search method to find a maximum scoring structure over the DAG space constrained by this skeleton. To the best of our knowledge, Van Dijk et al.  were the first ones to propose a learning method based on this strategy. But Max Min-Hill Climbing (MM-HC)  is probably the best-known method of this kind of approaches. The idea of finding high scoring BNs in a restricted or constrained search space has been further pursued in many works , ,  and . For example , presents a method which is able to find the BN model constrained to a given skeleton with the global optimal score. At the same time, many other outperforming methods have been proposed in order to elicit the skeleton of a BN by means of local structure discovery methods (see  for a recent review). In parallel to these works, researchers have also focused on the computation of full Bayesian solutions for the problem of learning BNs from data. They were motivated by the uncertainty in the model selection which is especially notorious in problems where the size of the learning data sets is usually small compared to the super-exponential size of the space of DAGs. Rather than look for a single BN maximizing a given score, like score + search approaches, the problem now is computing the marginal posterior probability of the edges of a BN  (or any other structural feature) and making predictions using a Bayesian model averaging approach. Two different sets of approaches have been employed to obtain these Bayesian solutions. Markov chain Monte Carlo (MCMC) methods , , ,  and , whose aim is to sample different DAG structures according to the stationary distribution of the defined Markov chain; and stochastic search methods ,  and  which, unlike MCMC, do not have the goal of converging to a stationary distribution (which may be unattainable, and is usually impossible to assess), but simply listing and scoring a collection of high scoring models which are visited during the search process. In this work we aim to show that the successful Max Min-Hill Climbing decomposition idea  for finding maximum scoring single BNs can also be applied to develop new methods for obtaining Bayesian solutions to the BN structure learning problem. More precisely, what we show is that the computation of a Bayesian approximation in two steps, firstly over the DAGs constrained by a previously given skeleton and, in a second step, over an unconstrained DAG space but using the information of the first step, is a very successful strategy for this problem. Two different implementations of this idea are evaluated in this work. The first one is a stochastic search method which, as first step, carries out a search constrained by the skeleton performing random movements according to the Bayesian scores of the alternative local movements. In the second step, this stochastic search is run again, starting randomly at any of the high scoring models found in the first step and without constraining the random local moves. The second method introduced in this work is a new MCMC in the DAG space. Similarly to the other method, we firstly run a MCMC constrained by the skeleton of the BN which should identify high scoring constrained DAG models. After that, in a second step, a standard MCMC over the DAG space is run without any constraint. However, this MCMC is conducted by a new proposal distribution, which introduces global movements using the samples generated by the first MCMC over the constrained DAG space. This paper is structured as follows. In Section 2, we provide background knowledge on Bayesian scores and skeletons, as well as on Bayesian structure learning methods. Subsequently, in Section 3, we present the details of our two skeleton-based approaches. The experimental evaluation is given in Section 4. Finally, in Section 5, we provide the main conclusions and future research.
نتیجه گیری انگلیسی
In this work we have presented novel skeleton-based approaches for Bayesian structure learning of BNs. These approaches are supported by standard techniques to address this problem, such as stochastic search or Metropolis–Hasting methods. They start with the inference of a graph skeleton using a specific method. With this skeleton, they decompose the Bayesian approximation problem in two simpler problems. The first one consists in computing the Bayesian approximation over the space of DAGs constrained by this skeleton. This problem is simpler because the model space is much smaller. Then, the Bayesian solution is inferred over the unconstrained DAG space but using the previous approximation that, as shown in the experimental evaluation, makes it easier for the approximation techniques (stochastic search and the MCMC method) to get higher quality solutions. In consequence, we show that this idea, previously employed by algorithm MM-HC to find high quality single DAG models , can also be extended to perform Bayesian structure learning. Future research will be focused on the application of this methodology for performing Bayesian structure learning on high dimensional problems (problems with several hundreds of variables). Other lines of research will also be followed to improve the performance of our skeleton-based methods. For example, the edge reversal move employed by  for improving MCMC methods in the DAG space can be also integrated in SK-MCMC. Following the ideas pursued in , we also plan to develop new approaches based on dividing the graph skeleton in smaller graphs involving a lower number of nodes, computing a Bayesian approximation for each of them, and then using all of these approximations to build a better proposal distribution. The integration of expert/domain knowledge in order to refine the inferred models will be also considered in future work.