Download English ISI Article No. 29144
Article Title (Persian Translation)

Structural learning for Bayesian networks by testing complete separators in prime blocks

English Title
Structural learning for Bayesian networks by testing complete separators in prime blocks
Article Code | Publication Year | Pages
29144 | 2011 | 13 pages (PDF)
Source

Publisher: Elsevier - Science Direct

Journal: Computational Statistics & Data Analysis, Volume 55, Issue 12, 1 December 2011, Pages 3135–3147

Translated Keywords
Bayesian networks - complete separator - conditional independence - moral edge - prime block - structural learning
English Keywords
Bayesian network, Complete separator, Conditional independence, Moral edge, Prime block, Structural learning
Article Preview

English Abstract

In this paper, we consider how to recover the structure of a Bayesian network from a moral graph. We present a more accurate characterization of moral edges, based on which a complete subset (i.e., a separator) contained in the neighbor set of one vertex of the putative moral edge in some prime block of the moral graph can be chosen. This results in a set of separators to be searched that is generally smaller than the sets required by some existing algorithms. A so-called structure-finder algorithm is proposed for structural learning. The complexity of the proposed algorithm is analyzed and compared with those of several existing algorithms. We also demonstrate how to construct the moral graph locally from, separately, the Markov blanket, domain knowledge and d-separation trees. Simulation studies are used to evaluate the performances of various strategies for structural learning. We also analyze a gene expression data set by using the structure-finder algorithm.
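
The abstract's central idea can be illustrated with a short sketch: for a putative moral edge (u, v), candidate separators are drawn only from the complete subsets (cliques) of the neighbor set of one endpoint inside a prime block of the moral graph. The code below is a minimal illustration of that idea, not the paper's structure-finder algorithm; the conditional-independence test ci_test and the prime-block adjacency dict adj are assumed to be supplied by the caller.

```python
from itertools import combinations

def is_complete(subset, adj):
    """True if every pair of vertices in `subset` is adjacent in `adj`."""
    return all(b in adj[a] for a, b in combinations(subset, 2))

def find_complete_separator(u, v, adj, ci_test):
    """Search for a complete separator of the putative moral edge (u, v).

    adj     : dict mapping each vertex of a prime block to its set of neighbors
    ci_test : callable (u, v, S) -> True if u is independent of v given S
    Returns the separator found, or None (i.e., the edge is kept).
    """
    neighbors = sorted(adj[u] - {v})           # candidates come only from ne(u)
    for size in range(len(neighbors) + 1):     # increasing cardinality
        for S in combinations(neighbors, size):
            if is_complete(S, adj) and ci_test(u, v, set(S)):
                return set(S)                  # (u, v) is a moral edge
    return None
```

In such a scheme, the edge (u, v) would be deleted from the learned skeleton whenever a separator is returned.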

English Introduction

Bayesian networks play an important role in decision making and statistical inference and have been applied in many fields such as machine learning and bioinformatics. They offer powerful knowledge representations for independence, conditional independence and causal relationships among variables in a given domain (Whittaker, 1990, Lauritzen, 1996, Cowell et al., 1999, Pearl, 2000, Spirtes et al., 2000 and Jensen and Nielsen, 2007). Briefly, a Bayesian network is a directed graph with no directed cycles (i.e., a directed acyclic graph, denoted as DAG) G = (V, E) together with a probability distribution P. The vertices in G usually represent random variables X = (X_v)_{v∈V}, and P is the joint probability distribution of X with P(x) = ∏_{v∈V} P(x_v | x_{pa(v)}), where P(x_v | x_{pa(v)}) is a conditional distribution and pa(v) is the set of parents of v. The DAG is the qualitative component, which represents dependence and independence relationships: the absence of a directed edge represents the existence of certain conditional independence relationships among variables, and the presence of an edge represents a direct dependence or causal relationship. The joint probability distribution is the quantitative component, which represents the strength of association between variables. For a Bayesian network G = (V, E) with a distribution P, if subsets A and B of V are d-separated by S in G (see the corresponding definition in Section 2), then X_A and X_B are conditionally independent given X_S with respect to P. In this paper, we assume that all conditional independencies among variables implied in the true distribution can be indicated by d-separations of some DAGs, which is called the faithfulness assumption (Spirtes et al., 2000). Two DAGs over the same variable set are said to be Markov equivalent if they represent the same conditional independencies among variables. An equivalence class of Markov equivalent DAGs is represented by a partially directed DAG (PDAG) in which the directed edges are common to every DAG, while the undirected edges may be oriented one way in some DAGs and another way in others. Hence, the goal of structural learning is to construct a partially directed graph that represents the equivalence class from an observed data set.

In this paper, we consider the problem of learning the structure of a Bayesian network. This problem has been widely discussed by many authors (e.g., Cowell et al., 1999, Spirtes et al., 2000, and references therein). There are two major kinds of algorithms for structural learning, namely constraint-based approaches (Spirtes and Glymour, 1991, Verma and Pearl, 1990, Xie et al., 2006, Xie and Geng, 2008, Ma et al., 2008 and Liu et al., 2010) and search-and-score approaches (Chickering, 2002 and Heckerman et al., 1995). We focus on the constraint-based approaches in this article.

For many constraint-based algorithms, two major steps are usually adopted to recover the DAG structure (Xie and Geng, 2008). First, they learn the moral graph of the target DAG by applying Markov boundary learning algorithms, where the Markov boundary of a variable u is defined to be the set of variables composed of u's parents, its children, and its children's other parents (Pearl, 1988). Second, they perform further independence tests for deleting the moral edges and orienting edges on the basis of the moral graph learned in the first step.
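
As a concrete reading of two definitions used above — the factorization P(x) = ∏_{v∈V} P(x_v | x_{pa(v)}) and the Markov boundary of a vertex — the following sketch restates them in code, assuming the DAG is stored as a dict mapping each vertex to the set of its parents. The conditional distributions cpds are hypothetical callables introduced only for illustration; none of this code comes from the paper.

```python
import math

def log_joint(parents, cpds, x):
    """log P(x) = sum_v log P(x_v | x_pa(v)) for a DAG given by its parent sets.

    parents : dict v -> set of parents of v
    cpds    : dict v -> callable (x_v, x_pa) -> P(x_v | x_pa)   (hypothetical)
    x       : dict v -> observed value of X_v
    """
    return sum(math.log(cpds[v](x[v], {p: x[p] for p in parents[v]}))
               for v in parents)

def markov_boundary(parents, u):
    """Parents of u, children of u, and the children's other parents."""
    children = {v for v, pa in parents.items() if u in pa}
    spouses = {p for c in children for p in parents[c]} - {u}
    return set(parents[u]) | children | spouses
```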
Therefore, as mentioned in Xie and Geng (2008), in a constraint-based algorithm, searching for separators of pairs of variables is a major challenge for the orientation of edges and for the structure recovery of a DAG. Here a separator is a subset of variables given which the variable pair is conditionally independent. Verma and Pearl (1990) proposed the inductive causation (IC) algorithm, which searches for a separator among all possible subsets of the vertex set. Spirtes and Glymour (1991) proposed the PC algorithm, a general systematic way of searching for a separator in increasing order of cardinality within a constraint-based framework; in the PC algorithm, however, the search for a separator is limited to those vertices that are adjacent to the vertex pair. Geng et al. (2005) decomposed the moral graph into subgraphs (i.e., prime blocks), and then searched for a separator among all possible subsets within prime blocks.

In this paper, we present a more accurate characterization of moral edges in the moral graph. We show that a separator corresponding to a putative moral edge can be found among the complete subsets contained in the neighbor set of one vertex of the putative moral edge in some prime block. Using this result, we propose a structure-finder algorithm for structural learning from the moral graph, which substantially improves on the IC algorithm and the decomposition approach proposed by Geng et al. (2005). We also discuss how to construct moral graphs locally from marginal data.

The rest of this paper is organized as follows. In Section 2, we introduce notation and definitions. In Section 3, we give the characterizations of moral edges, and present the structure-finder algorithm and a complexity analysis. Section 4 discusses how to construct the moral graph locally from, separately, the Markov blanket, domain knowledge and d-separation trees. Simulation studies are conducted in Section 5 to compare the performance of our algorithm with that of existing algorithms. We analyze a gene expression data set in Section 6. Brief conclusions are drawn in Section 7. All proofs are presented in the Appendix.
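
The difference between the three search strategies described above can be made concrete by counting, for a single vertex pair (u, v), how many candidate separators each one enumerates: IC considers all subsets of V \ {u, v}, PC considers subsets of the vertices adjacent to u, and the approach of this paper considers only the complete subsets of ne(u) taken within a prime block. The sketch below is merely an illustration of this combinatorics under those assumptions, not the paper's complexity analysis.

```python
from itertools import combinations

def candidate_counts(u, v, vertices, adj):
    """Number of candidate separators for (u, v) under three strategies.

    vertices : iterable of all vertices (used for the IC count)
    adj      : dict vertex -> set of neighbors, restricted to one prime block
    """
    ic = 2 ** (len(set(vertices)) - 2)                      # all subsets of V \ {u, v}
    neighbors = sorted(adj[u] - {v})
    pc = 2 ** len(neighbors)                                # all subsets of ne(u)
    complete = sum(
        1
        for size in range(len(neighbors) + 1)
        for S in combinations(neighbors, size)
        if all(b in adj[a] for a, b in combinations(S, 2))  # complete subsets only
    )
    return ic, pc, complete
```

Only the last count depends on the edge structure of the block: the sparser the neighborhood of u, the fewer complete subsets there are to test.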

English Conclusion

7. Conclusion

In this paper, we have given a more accurate characterization of moral edges in the moral graph and proposed the SF algorithm for structural learning, which substantially improves on the IC algorithm and the approach of Geng et al. (2005). Although the SF algorithm depends on the quality of the constructed moral graph, the simulation results show that the SF algorithm using our constructed moral graph outperforms the PC algorithm, the TPDA algorithm and the MMHC algorithm according to the SHD criterion in most cases. Although we assume in this paper that the data are completely observed, missing data or data with latent variables may arise in practice. The IC* algorithm proposed by Pearl (2000) can recover latent structures and is similar to the IC algorithm. Generalization of our SF algorithm to missing data or data with latent variables is of great research interest.
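
The SHD (structural Hamming distance) criterion mentioned above counts the vertex pairs whose edge status — absent, undirected, or directed with a particular orientation — differs between the learned PDAG and the true one. The sketch below is a minimal illustration under the assumption that each PDAG is encoded as a set of directed edges plus a set of undirected edges; this encoding is ours, not the paper's.

```python
from itertools import combinations

def edge_status(u, v, directed, undirected):
    """Edge status of the pair (u, v) in a PDAG."""
    if (u, v) in directed:
        return "u->v"
    if (v, u) in directed:
        return "v->u"
    if (u, v) in undirected or (v, u) in undirected:
        return "u-v"
    return "none"

def shd(vertices, learned, true):
    """learned, true: (directed_edge_set, undirected_edge_set) pairs."""
    return sum(
        edge_status(u, v, *learned) != edge_status(u, v, *true)
        for u, v in combinations(sorted(vertices), 2)
    )
```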