A proposed validation framework for expert elicited Bayesian Networks
|Article code||Publication year||English article||Persian translation||Word count|
|29191||2013||6-page PDF||Order||Not calculated|
Publisher : Elsevier - Science Direct
Journal : Expert Systems with Applications, Volume 40, Issue 1, January 2013, Pages 162–167
The popularity of Bayesian Network modelling of complex domains using expert elicitation has raised questions of how one might validate such a model given that no objective dataset exists for the model. Past attempts at delineating a set of tests for establishing confidence in an entirely expert-elicited model have focused on single types of validity stemming from individual sources of uncertainty within the model. This paper seeks to extend the frameworks proposed by earlier researchers by drawing upon other disciplines where measuring latent variables is also an issue. We demonstrate that even in cases where no data exist at all there is a broad range of validity tests that can be used to establish confidence in the validity of a Bayesian Belief Network.

Highlights
► Bayesian Network models are difficult to validate when data are expert elicited.
► We examine the sources of confidence in Bayesian Belief Networks.
► The validation frameworks from related disciplines are reviewed.
► We propose a framework for Bayesian Network validation building on the ideas reviewed.
Bayesian Networks (BNs) are an increasingly popular tool for modelling complex systems, particularly in the absence of easily accessed data. A BN describes the joint probability distribution of a network of factors using a Directed Acyclic Graph (Pearl, 1988). Factors that influence the likelihood of the outcome node being in any given state are represented as nodes on the graph. If the state of one model factor influences the state of another, a directional arc is drawn between the two nodes representing these factors in the model. The combination of the nodes and their relationships is the BN structure.

Each node in the graph can adopt any one of a finite set of states. For example, a factor representing magnitude could be classified as 'high' or 'low'. While nodes do not strictly have to be discretised, discretisation is far more common than not because of its computational convenience, so we do not discuss models that include non-discretised nodes in this paper.

Finally, each node and each relationship between nodes is quantified according to the likelihood of the node adopting a given state. For input nodes these probabilities are unconditional, whereas the probabilities of nodes internal to the model are conditional on the states of their parent nodes. The strength and direction of the relationship between model factors is defined in the conditional probability table associated with the child node.

BNs are often created through a process of expert elicitation, in which experts are asked to create a complex systems model by giving their opinions on the model structure, discretisation, and parameterisation. The validity of these models is generally tested through one of two procedures: by comparing the model predictions to data available for the subject matter, or by asking the experts who contributed to the model creation to comment on its accuracy.
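As an aside on the description of nodes, arcs and conditional probability tables above, the following minimal Python sketch shows how a small discrete BN yields a joint distribution as the product of each node's probability given its parents. The node names (`arrivals`, `staffing`, `queue`), their states and every probability are hypothetical values chosen for illustration; they are not taken from the paper or from the airport model it mentions.

```python
from itertools import product

# Hypothetical network: arrivals -> queue <- staffing (all names and numbers are illustrative).
states = {
    "arrivals": ("low", "high"),
    "staffing": ("low", "high"),
    "queue": ("short", "long"),
}
parents = {"arrivals": (), "staffing": (), "queue": ("arrivals", "staffing")}

# cpt[node][tuple of parent states][node state] = probability.
# Input nodes have a single row keyed by the empty tuple (unconditional probabilities).
cpt = {
    "arrivals": {(): {"low": 0.7, "high": 0.3}},
    "staffing": {(): {"low": 0.4, "high": 0.6}},
    "queue": {
        ("low", "low"): {"short": 0.6, "long": 0.4},
        ("low", "high"): {"short": 0.9, "long": 0.1},
        ("high", "low"): {"short": 0.1, "long": 0.9},
        ("high", "high"): {"short": 0.5, "long": 0.5},
    },
}


def joint_probability(assignment):
    """Multiply P(node | parents) over all nodes for one full assignment of states."""
    p = 1.0
    for node, node_parents in parents.items():
        parent_states = tuple(assignment[q] for q in node_parents)
        p *= cpt[node][parent_states][assignment[node]]
    return p


def marginal(node, state):
    """Sum the joint probability over every full assignment in which node == state."""
    names = list(states)
    total = 0.0
    for combo in product(*(states[n] for n in names)):
        assignment = dict(zip(names, combo))
        if assignment[node] == state:
            total += joint_probability(assignment)
    return total


print(marginal("queue", "long"))
```

Brute-force enumeration is used here only because the network has eight joint states; it would not scale to networks of realistic size, where a dedicated inference library would be the more sensible choice.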
This paper argues that these tests are limited in their ability to accurately test the validity of BNs, and presents a framework for more thorough validity testing. The work presented here stems from questions raised during the creation of a BN from expert elicitation to model the inbound passenger processing time at Australian airports. The network was elicited in collaboration with managerial and operational experts from Australian Customs and Border Protection Service (ACBPS) for the purpose of gaining more informative reporting of key performance indicators. In particular, the modelling of critical infrastructure underlined the importance of establishing that both experts and modellers have confidence in the final model produced.

The paper is structured as follows. First, the concept of validation as it applies to BNs is introduced in Section 1.1. Second, the sources of confidence in BN validity are discussed, including network structure, discretisation, and parameterisation in Section 1.2. Third, prior approaches to validating latent and expert elicited scales and models are introduced, drawing from psychometrics, system dynamics and other BN research in Section 2. These principles are then applied to BNs with examples from the airport inbound passenger processing model in Section 3.

1.1. Confidence in Bayesian Belief Network validity
Model validity is often conceptualised as a simple test of a model's fit with a set of data. However, validity is a much broader construct: in essence, validity is the ability of a model to describe the system that it is intended to describe, both in the output and in the mechanism by which that output is generated. In this paper we consider this broader definition of validity. The need for an explicit set of validity tests for BNs over and above comparisons with data is clear. In current practice, where data are available on the phenomenon of interest, these data may be used to validate model predictions.
Several tests of this nature exist, such as a variety of Normalised Maximum Likelihood model selection criteria (Silander, Ross, & Myllymaki, 2009). However, a common reason for using BN models is a lack of available data. Examples of phenomena for which data are scarce include population characteristics in many developing countries (Shakoor, Taylor, & Behrens, 1997), global epidemiological phenomena (Masoli, Fabian, Holt, & Beasley, 2004), organised crime (Sobel & Osoba, 2009), conservation (Johnson, 2009) and biosecurity risk analysis (Barrett, Whittle, Mengersen, & Stoklosa, 2010). In such cases, expert opinion can be elicited to create a Bayesian Belief Network (BBN).

A common technique for validating BBNs based on expert opinion in the absence of data is simply to ask the experts whether they agree with the model structure, discretisation, and parameterisation (see Korb & Nicholson (2010) for an excellent overview of BN applications and methods). This simple test is necessary, but not sufficient, to independently verify the validity of a complex model. Even where data are available, model fit is only a part of the model's overall validity. These considerations lead to this paper's proposition of a general validity framework for BNs.

1.2. Sources of confidence in Bayesian Network validity
In order to approach a validation framework for BNs, a short discussion of the background assumptions of this framework is required. First, we assume there exists a latent, unobservable 'true' model (or set of acceptable 'true' models) for the phenomenon of interest against which the expert elicited model can be compared. Second, for the purposes of the validity framework presented in this paper, we consider a BN model to consist of four elements: model structure (Section 1.2.1), node discretisation (Section 1.2.2), discrete state parameterisation (Section 1.2.3), and model behaviour (Section 1.2.4). Each of these elements has been raised as a source of uncertainty in BN modelling.
We provide a discussion of each element and consider the importance of validity within each model element, and within the model as a whole. The model elements are summarised in Fig. 1.

Fig. 1. Sources of confidence in Bayesian Network validity.

1.2.1. Structure
There are a number of questions to answer when creating the structure of a BN. The first is the appropriate number of nodes to include, which is a question of the modelling domain, level and scope. It is widely acknowledged that networks with a large number of nodes can easily become computationally intractable, as can networks with a large number of arcs between nodes (Koller & Pfeffer, 1997). The BN creator should ensure that the model is neither too simple nor too complex in its explanation of the system.

1.2.2. Discretisation
The discretisation process allows us to model systems probabilistically by taking continuous factors and assigning them intervals, ordinal states or categories, then modelling over the discrete domain. In more recent research, Uusitalo (2007) pointed out that such discretisation is a major disadvantage of BN modelling if it is necessary for the model, and Myllymaki, Silander, Tirri, and Uronen (2002) outline how the process has the potential to destroy useful information. Given the information loss inherent in the discretisation process, ensuring that the states are a valid interpretation of the state space of the node is critical for a defensible network.

1.2.3. Parameterisation
Parameterisation refers to adding the values elicited from experts to the belief network (Woodberry, Nicholson, Korb, & Pollino, 2005). Much work has been conducted on controlling this stage of the process (Renooij, 2001), but little has been written about how to validate expert responses post-elicitation.

1.2.4. Model behaviour
Finally, the behaviour of the model can be seen as the joint likelihood of the entire network as well as its sub-networks and relationships; hence confidence in model behaviour is founded upon the validity of the other three dimensions of the model. It is important to note that in the case of BNs, we are interested not only in whether the model can tell us what a system is doing under certain conditions, but also in the factors and relationships that bring about this behaviour. This makes the problem of validating the model extremely complex when attempted wholesale, and justifies partitioning the dimensions of uncertainty for BNs. As such, it is recommended that the structure, discretisation and parameterisation are tested for validity before any model behaviour tests are run.
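The ordering recommended above (structure, then discretisation, then parameterisation, before any behaviour tests) can be illustrated with a small Python sketch. It performs only mechanical well-formedness checks, which are a much weaker precondition than the validity tests this paper is concerned with, and it is not the authors' method. The function name `find_problems`, the data layout and the example network are all assumptions made for illustration.

```python
import math
from itertools import product


def find_problems(states, parents, cpt, tol=1e-9):
    """Return a list of well-formedness problems; an empty list means none were found."""
    problems = []

    # Structure: every parent must be a declared node, and the graph must be acyclic.
    for node, node_parents in parents.items():
        for p in node_parents:
            if p not in states:
                problems.append(f"{node}: unknown parent {p}")
    resolved, pending = set(), set(states)
    while pending:
        # A node is ready once all of its declared (known) parents have been resolved.
        ready = {n for n in pending
                 if all(p in resolved for p in parents.get(n, ()) if p in states)}
        if not ready:
            problems.append("structure: cycle among " + ", ".join(sorted(pending)))
            break
        resolved |= ready
        pending -= ready

    # Discretisation: each node needs at least two distinct states.
    for node, node_states in states.items():
        if len(node_states) < 2 or len(set(node_states)) != len(node_states):
            problems.append(f"{node}: needs at least two distinct states")

    # Parameterisation: one row per combination of parent states, each a distribution.
    for node, node_states in states.items():
        node_parents = parents.get(node, ())
        if any(p not in states for p in node_parents):
            continue  # already reported under structure
        for combo in product(*(states[p] for p in node_parents)):
            row = cpt.get(node, {}).get(combo)
            if row is None:
                problems.append(f"{node}: no probabilities for parents {combo}")
            elif set(row) != set(node_states):
                problems.append(f"{node}: row for parents {combo} does not match the node states")
            elif any(v < 0 for v in row.values()) or not math.isclose(
                    sum(row.values()), 1.0, abs_tol=tol):
                problems.append(f"{node}: probabilities for parents {combo} do not form a distribution")
    return problems


# Hypothetical two-node network; the ("high",) row deliberately sums to 0.9.
states = {"a": ("low", "high"), "b": ("short", "long")}
parents = {"a": (), "b": ("a",)}
cpt = {
    "a": {(): {"low": 0.7, "high": 0.3}},
    "b": {("low",): {"short": 0.8, "long": 0.2},
          ("high",): {"short": 0.3, "long": 0.6}},
}
print(find_problems(states, parents, cpt))
```

A network that passes these checks is merely well formed. Whether its structure, states and probabilities are a defensible description of the system is the separate question that the validity framework addresses.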
Conclusion
In this paper we have outlined a broad range of conceptual tests that can be applied to validate BNs. These validity tests incorporate standard model-data fit comparisons, but expand the construct of validity to the broader definition of whether or not a model describes the system it is intended to describe, and produces the output it is intended to produce. Many of these validity tests can be used where no objective data exist. By combining existing research from BN validation with validation tests from psychometrics as well as alternative complex systems disciplines, this paper introduces a starting point for discussing a framework for building confidence in the validity of BNs. The presented framework is not intended to be comprehensive; instead, the aim is to establish that the validity of a BN can be tested, and should be tested, independent of the model fit to available data or expert confirmation. Disciplines such as psychometrics, with a history of measuring latent constructs, can provide a useful perspective on the problem. The framework presents a sequence of steps that can be followed to establish confidence in model validity, beginning with creating a nomological map of the literature surrounding the domain, then gradually building confidence in six types of model validity, using both general and specific tests. The application of this framework to the BN developed in conjunction with ACBPS will, to our knowledge, be a novel practical demonstration of such an approach to BN validation. The framework presented in this paper is intended to be domain-general, and there would be great value in establishing the versatility of the tests by applying them to complex models in other domains. Future work will extend to formalising and quantifying many of the tests in the context of BN modelling, and obtaining perspectives on model validity from other disciplines that deal with unobserved variables and complex systems.