Modeling semantics of inconsistent qualitative knowledge for quantitative Bayesian network inference
|Paper code||Publication year||English paper||Persian translation||Word count|
|28617||2008||11-page PDF||Order||Not calculated|
Publisher : Elsevier - Science Direct
Journal : Neural Networks, Volume 21, Issues 2–3, March–April 2008, Pages 182–192
We propose a novel framework for performing quantitative Bayesian inference based on qualitative knowledge. Here, we focus on the treatment of inconsistent qualitative knowledge. A hierarchical Bayesian model is proposed for integrating inconsistent qualitative knowledge by calculating a prior belief distribution based on a vector of knowledge features. Each inconsistent knowledge component uniquely defines a model class in the hyperspace. A set of constraints within each class is generated to describe the uncertainty in the ground Bayesian model space. Quantitative Bayesian inference is approximated by model averaging with Monte Carlo methods. Our method is first benchmarked on the ASIA network and then applied to a realistic biomolecular interaction modeling problem for breast cancer bone metastasis. Results suggest that our method enables consistent modeling and quantitative Bayesian inference by reconciling a set of inconsistent qualitative knowledge.
Bayesian reasoning provides a probabilistic approach to inference. In the Bayesian framework, quantities of interest are described by probabilities, and optimal decisions can be made by reasoning about these probabilities together with the observations or evidence. Bayesian reasoning is important to machine learning because it provides a quantitative approach to weighting the evidence supporting alternative hypotheses. Numerous algorithms have been proposed for learning the structure and parameters of a Bayesian network from observed data. These algorithms produce a single Bayesian model by maximizing its probability given the training data, i.e. a maximum a posteriori approximation. In realistic problems, learning a Bayesian model from training data requires a relatively large amount of observed data compared to the size of the network. However, the data basis is often very sparse and hardly sufficient to select one adequate model given the model uncertainty. Selecting a single model may therefore induce overfitting to the data and lead to strongly biased inference results. It is thus preferable to adopt a full Bayesian approach with model averaging. Besides the training data, prior background knowledge provides many ways to adjust uncertainties. Prior background knowledge includes qualitative and quantitative knowledge, which describe the entities and their relationships at different levels of abstraction. Quantitative knowledge can be exemplified by a probability elicitation procedure with a domain expert. In most domains, this is particularly difficult because expert knowledge is limited at this level. In contrast, qualitative knowledge, which only provides loose constraints with uncertainty on the entities and their relations, exists in many science and engineering domains.
For example, in biomedicine, in the statement “Gene CTGF, IL11 and OPN cooperatively activate bone metastasis in breast cancer”, the entities are the genes CTGF, IL11 and OPN and bone metastasis in breast cancer, and their qualitative relation is: cooperatively activate. In some cases, there are properties which further specify the qualitative relationship. In “The risk of lung cancer among smokers is approximately 10 times higher than among non-smokers”, the relation is that smoking causes lung cancer, and the property is that the influence is about 10 times higher than for non-smokers. Recent studies (Chang and Stetter, 2007a; Chang and Stetter, 2007b) have shown that qualitative knowledge can be translated into a set of constraints on the Bayesian model space. This set of constraints defines the model uncertainty in the structure space and the parameter space respectively. The model uncertainty represented by the qualitative knowledge enables a full Bayesian approach, in which a class of Bayesian networks consistent with the semantics of the set of qualitative hypotheses is drawn according to the model uncertainty. Probabilistic network inference and reasoning are derived by performing quantitative prediction and inference in each Bayesian model and averaging these quantitative results, weighted by the model posterior probability. This approach has been successfully applied to both a well-known benchmark model and a real-world application. However, one significant drawback of this qualitative knowledge-driven approach to probabilistic network modeling and inference is that it cannot deal with inconsistent qualitative knowledge. It is well known that knowledge is often inconsistent, i.e. in the same domain there may exist contradicting qualitative statements on dependency, causality and parameters over a set of entities. Therefore, it is imperative to develop methods for reconciling inconsistent qualitative knowledge in order to model Bayesian networks and perform quantitative prediction.
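The idea of turning a qualitative statement into a constraint on the parameter space and then averaging inference results over the admissible models can be illustrated with a small Python sketch. It is not the authors' implementation: the two-node smoker/cancer model, the uniform distribution over the constrained region, the prior `p_smoker = 0.3` and all function names are assumptions chosen for illustration. With no training data and a uniform distribution over the admissible models, every sampled model receives the same weight.

```python
import random

def sample_constrained_model(rng):
    """Draw (P(cancer | smoker), P(cancer | non-smoker)) uniformly, subject to
    the qualitative constraint 'smoking increases the probability of cancer'."""
    while True:
        p1 = rng.random()  # P(cancer | smoker)
        p0 = rng.random()  # P(cancer | non-smoker)
        if p1 >= p0:       # keep only models consistent with the statement
            return p1, p0

def posterior_smoker_given_cancer(p1, p0, p_smoker):
    """Bayes' rule inside one ground model: P(smoker | cancer)."""
    num = p1 * p_smoker
    den = num + p0 * (1.0 - p_smoker)
    return num / den if den > 0 else p_smoker

def model_average(n_samples=20000, p_smoker=0.3, seed=0):
    """Monte Carlo model averaging: equal weights, because the models are
    drawn from a uniform distribution over the constrained region."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        p1, p0 = sample_constrained_model(rng)
        total += posterior_smoker_given_cancer(p1, p0, p_smoker)
    return total / n_samples

estimate = model_average()
print(f"averaged P(smoker | cancer) = {estimate:.3f}")
```

Because every admissible model satisfies P(cancer | smoker) ≥ P(cancer | non-smoker), observing cancer can only raise the belief in smoking, so the averaged posterior lies above the assumed prior of 0.3 even though no numerical parameters were specified.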
In this paper, we propose a novel framework for performing quantitative Bayesian inference with model averaging based on inconsistent qualitative statements. It is a coherent extension of the framework for quantitative Bayesian inference based on a set of consistent hypotheses introduced in Chang and Stetter (2007b). Our method interprets the qualitative statements by a vector of knowledge features whose structure can be represented by a hierarchical Bayesian network. The prior probability of each qualitative knowledge component is calculated as the joint probability distribution over the features and can be decomposed into the product of the conditional probabilities of the knowledge features. These knowledge components define multiple Bayesian model classes in the hyperspace. Within each class, a set of constraints on the ground Bayesian model space can be generated. Therefore, the distribution over the ground model space can be decomposed into a set of weighted distributions determined by each model class. This framework is used to perform full Bayesian inference, which can be approximated by Monte Carlo methods but is analytically tractable for smaller networks and statement sets. In Section 2, we review previously reported work related to our approach. We also clarify the contribution of this approach compared to the approach proposed previously in Chang and Stetter (2007b). In Section 3, we propose the hierarchical knowledge model for modeling and integrating qualitative knowledge. In Section 4, we describe the quantitative Bayesian inference method with model averaging based on inconsistent qualitative knowledge. In Section 5, we first benchmark our method on the ASIA network by reconciling a set of inconsistent hypotheses about the interactions between pairs of variables, modeling Bayesian networks from them and performing quantitative inference based on these Bayesian networks.
Then we apply our method to integrate a set of realistic inconsistent knowledge with regard to the TGFβ-Smad signaling pathway in the breast cancer bone metastasis network for constructing the Bayesian models and performing quantitative inference. Conclusions and further discussion are provided in Section 6.
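The decomposition outlined in this introduction (a prior for each knowledge component as a product of conditional feature probabilities, one model class per component, and a class-weighted average over ground models) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the feature names, the numerical beliefs in `COMPONENTS`, the two-variable A/B model and all function names are hypothetical.

```python
import random

# Two contradictory statements about the same pair of variables. Each is
# described by a hypothetical vector of knowledge features; the numbers are
# illustrative conditional beliefs, not values from the paper.
COMPONENTS = {
    "A activates B": {"dependency": 0.9, "direction | dependency": 0.8,
                      "sign | direction": 0.6},
    "A inhibits B":  {"dependency": 0.9, "direction | dependency": 0.8,
                      "sign | direction": 0.4},
}

def class_weights(components):
    """Prior of each component = product of its feature conditionals,
    normalised over the competing components."""
    raw = {}
    for name, features in components.items():
        prob = 1.0
        for p in features.values():
            prob *= p
        raw[name] = prob
    z = sum(raw.values())
    return {name: p / z for name, p in raw.items()}

def sample_p_b_given_a(name, rng):
    """Rejection-sample (P(B|A), P(B|not A)) under the constraint implied by
    the model class and return P(B|A) for that ground model."""
    while True:
        p1, p0 = rng.random(), rng.random()
        if name == "A activates B" and p1 >= p0:
            return p1
        if name == "A inhibits B" and p1 <= p0:
            return p1

def averaged_prediction(components, n=20000, seed=0):
    """Average P(B|A) within each class, then weight by the class prior."""
    rng = random.Random(seed)
    weights = class_weights(components)
    pred = 0.0
    for name, w in weights.items():
        mean = sum(sample_p_b_given_a(name, rng) for _ in range(n)) / n
        pred += w * mean
    return weights, pred

weights, prediction = averaged_prediction(COMPONENTS)
print(weights, round(prediction, 3))
```

In this toy setting the two statements cannot both hold, yet neither is discarded: each contributes to the final prediction in proportion to its prior belief, so the contradiction is expressed as a mixture over model classes.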
Conclusion
In this paper, we proposed an extension to the approach introduced in Chang and Stetter (2007b), which is a baseline method for qualitative knowledge-driven probabilistic network modeling and quantitative inference. In this extended approach, we focused on handling inconsistent qualitative knowledge, i.e. on modeling the uncertainty in the knowledge space and the model class space, since each knowledge component determines a class of Bayesian networks with a unique structure and various parameter settings. We do so by constructing a hierarchical qualitative knowledge feature network. The probability distribution over the set of inconsistent knowledge can thus be calculated as the joint probability of these features. In this way, we can reconcile the inconsistent knowledge into a belief distribution over the knowledge space, i.e. the Bayesian class space. Each Bayesian class represents different structures and parameter settings over the set of domain variables and is assigned a belief according to the probability distribution over the knowledge space. Quantitative predictions and inference are calculated by averaging the predictions of each ground Bayesian network over all Bayesian network classes. We benchmarked our approach on the well-known ASIA network. We constructed the probability distribution in the ASIA model structure and parameter space according to a set of inconsistent qualitative knowledge. The uncertainty in the knowledge space was likewise computed based on the hierarchical knowledge feature model. We drew ASIA model samples from each Bayesian network class across the knowledge space. Probabilistic inference was calculated by averaging the individual inference results of the ground ASIA models, weighted by the model prior distribution given the Bayesian model class and by the prior distribution of the Bayesian model class.
We compared our predictions to those obtained with the true ASIA model and conclude that our method can reconcile the inconsistent knowledge parts and make reasonable probabilistic predictions. Similarly, we applied our method to a realistic problem, i.e. constructing the molecular signaling pathway in breast cancer bone metastasis based on a set of inconsistent qualitative knowledge on the molecular interactions. In this case, dynamic Bayesian networks are employed to represent the recurrent model structure space, i.e. the interactions between the molecules in the pathway. The parameter space of the dynamic Bayesian network is modeled according to the qualitative knowledge model. The probability distribution in the knowledge space is computed based on the hierarchical knowledge feature model. Based on the set of inconsistent knowledge, we predicted the probability of bone metastasis formed by various cell lines with different configurations of the Smad-family protein expression levels. Comparing the prediction results to the experimental observations, we conclude that our method can reasonably perform quantitative inference in a realistic problem. For future research, it is first necessary to develop a framework for integrating incomplete qualitative knowledge, so that a complete and consistent representation of the Bayesian networks can be formed from the independent sub-networks built from the qualitative knowledge. Second, it is important to develop an integrative framework which merges the knowledge-driven Bayesian inference approach with data-based Bayesian inference algorithms. In Bayesian network learning algorithms, the knowledge-driven approach would provide a general scheme for constructing an informative prior distribution on the model structure and parameter space based on qualitative background knowledge.