Bayesian sensitivity analysis of nonlinear models
Article code | Publication year | English paper pages |
---|---|---|
26684 | 2013 | 9 pages (PDF) |
Publisher: Elsevier - Science Direct
Journal: Mechanical Systems and Signal Processing, Volume 34, Issues 1–2, January 2013, Pages 57–75
English Abstract
Sensitivity analysis allows one to investigate how changes in input parameters to a system affect the output. When computational expense is a concern, metamodels such as Gaussian processes can offer considerable computational savings over Monte Carlo methods, albeit at the expense of introducing a data modelling problem. In particular, Gaussian processes assume a smooth, non-bifurcating response surface. This work highlights a recent extension to Gaussian processes which uses a decision tree to partition the input space into homogeneous regions, and then fits separate Gaussian processes to each region. In this way, bifurcations can be modelled at region boundaries and different regions can have different covariance properties. To test this method, both the treed and standard methods were applied to the bifurcating response of a Duffing oscillator and a bifurcating FE model of a heart valve. It was found that the treed Gaussian process provides a practical way of performing uncertainty and sensitivity analysis on large, potentially-bifurcating models, which cannot be dealt with by using a single GP, although an open problem remains how to manage bifurcation boundaries that are not parallel to coordinate axes.
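To give a concrete picture of the kind of bifurcating response named in the abstract, the sketch below integrates a Duffing oscillator and sweeps its forcing amplitude. The parameter values, the swept range and the choice of sweeping the forcing amplitude are illustrative assumptions, not the case-study set-up used in the paper.

```python
# Illustrative sketch (not the paper's actual case study): a Duffing oscillator
#   x'' + delta*x' + alpha*x + beta*x^3 = gamma*cos(omega*t)
# whose steady-state amplitude may jump as the forcing amplitude gamma varies,
# giving the kind of bifurcating response surface discussed in the abstract.
import numpy as np
from scipy.integrate import solve_ivp

def duffing_rhs(t, state, delta, alpha, beta, gamma, omega):
    x, v = state
    return [v, -delta * v - alpha * x - beta * x**3 + gamma * np.cos(omega * t)]

def steady_state_amplitude(gamma, delta=0.1, alpha=1.0, beta=5.0, omega=1.2):
    """Integrate past the transient and return the peak steady-state displacement."""
    sol = solve_ivp(duffing_rhs, (0.0, 300.0), [0.0, 0.0],
                    args=(delta, alpha, beta, gamma, omega),
                    dense_output=True, max_step=0.05)
    t_late = np.linspace(200.0, 300.0, 2000)   # discard the transient part
    return np.max(np.abs(sol.sol(t_late)[0]))

# Sweep the forcing amplitude; a sudden jump in amplitude between neighbouring
# gamma values would indicate a bifurcation in the response.
for gamma in np.linspace(0.05, 0.6, 12):
    print(f"gamma = {gamma:.2f}  ->  amplitude ~ {steady_state_amplitude(gamma):.3f}")
```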
English Introduction
Uncertainty analysis (UA) is a field of increasing interest in computer modelling, both within and beyond the realm of engineering. Advances in the theory of numerical simulation (such as the continuing advances in finite element (FE) analysis and computational fluid dynamics), coupled with a steady increase in available processing power, have enabled the use of increasingly sophisticated simulations to model complicated real-world processes. However, the increased complexity of these models tends to require a greater amount of information to be specified at the input. Examples of input variables within the context of engineering models could include material properties, loads, dimensions and temperatures. Any of these inputs can be subject to some uncertainty; for example, the operating temperature of a structure could be within a wide range, or the material properties could vary naturally (a good example of this is in modelling biomaterials; see [1]). The question, central to UA, then arises: what is the uncertainty in the model output, given the input uncertainty?

A furtherance of UA is Sensitivity Analysis (SA), which evaluates the contribution of inputs (and sets of inputs) to the output uncertainty, and the effects of varying parameters within specified ranges. The motivations of SA are various (Saltelli provides an extensive list [2]), including the identification of particularly influential parameters to see whether their uncertainty can be reduced somehow (thereby reducing output uncertainty), and gaining deeper understanding of a model's response to its inputs. In general, SA is seen as a useful tool to increase the robustness and overall quality of computer simulations, and often goes hand in hand with UA.

Introducing some notation, let the model to be investigated be denoted as f(x), where x is a d-dimensional vector of uncertain inputs (there may also be a number of model inputs that are regarded as known, which will not be considered here). The model may indeed be a very complex system of nonlinear equations, but from a "black box" point of view it can simply be regarded as a function of its inputs. The model could yield any number of outputs, but attention here will be restricted to a scalar output y = f(x).

In order to perform an uncertainty analysis, it is necessary to somehow quantify the uncertainty on the inputs. Although there are many ways of doing this (see the recent book by Klir for a summary [3]), this article will consider the probabilistic framework. It is therefore assumed from here on that the inputs are random variables $\{X_i\}_{i=1}^{d}$, and that a joint probability density function (pdf) p(x) can be defined over them. This lightly skips over the often-troublesome problem of eliciting the pdf, though an extensive discussion on this matter can be found in [4].

From the probabilistic perspective, the aim of UA is to estimate the pdf of the output random variable Y, given the p(x) assigned to the inputs. In practice, this is often simplified to the estimation of the expected value E(Y) and the variance var(Y). SA, on the other hand, requires a quantification of the sensitivity of y to changes in the inputs. Saltelli discusses a great number of ways to do this [5]. Two simple types of SA are ranking variables in order of importance (screening), and taking partial derivatives of y with respect to individual inputs (local SA). Screening is usually done as a precursor to a more sophisticated SA, to filter out unimportant variables.
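As a concrete illustration of the probabilistic framing above, the following minimal sketch propagates an assumed input pdf through a toy black-box model by plain Monte Carlo sampling and reports E(Y) and var(Y). The model f and the input distributions are illustrative assumptions, not taken from the paper.

```python
# Minimal Monte Carlo uncertainty analysis of a toy model y = f(x), treating
# f as a black box.  The model and the input pdfs below are assumptions made
# purely for illustration.
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    """Toy black-box model with a scalar output (stand-in for an expensive FE model)."""
    return x[..., 0] ** 2 + np.sin(3.0 * x[..., 1]) + 0.5 * x[..., 0] * x[..., 2]

# Uncertain inputs X_1..X_3 with an assumed joint pdf (independent normals here).
N = 100_000
X = rng.normal(loc=[1.0, 0.0, 2.0], scale=[0.2, 0.5, 0.3], size=(N, 3))

Y = f(X)
print("E(Y)   ~", Y.mean())
print("var(Y) ~", Y.var())
```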
Local SA is also useful, but only considers small perturbations about the nominal values of the input parameters. In the case of complex nonlinear models with wide uncertainties, this approach is unsatisfactory. The most informative approach is known as global SA, in which the variation of y is examined over the whole of the input space. By far the most widely used method of doing this, proposed by Sobol' [6], involves decomposing var(Y) into portions attributable to inputs and subsets of inputs. Specifically, a variance decomposition is used (under an assumption of independence between inputs) such that

$$\mathrm{var}(Y)=\sum_i V_i+\sum_i\sum_{j>i}V_{ij}+\cdots+V_{12\ldots d}\qquad(1)$$

where

$$V_i=\mathrm{var}\{E(Y|X_i)\},\qquad V_{ij}=\mathrm{var}\{E(Y|X_i,X_j)\}-\mathrm{var}\{E(Y|X_i)\}-\mathrm{var}\{E(Y|X_j)\},\quad\ldots$$

The quantity V_i represents the amount by which var(Y) would be reduced if X_i were to become known, therefore giving a global measure of the importance of that variable. The higher-order terms V_ij represent variance due to interactions between variables, additional to the variance caused by the inputs acting alone. There are 2^d − 1 terms in this decomposition, with interactions up to the final d-way interaction between all variables (the last term in Eq. (1)). Typically these quantities are standardised by dividing by var(Y), giving, e.g., S_i = V_i/var(Y), where S_i is known as the main effect index (MEI); the S_ij, S_ijk, etc. follow from similar definitions. Note that this implies that

$$\sum_i S_i+\sum_i\sum_{j>i}S_{ij}+\cdots+S_{12\ldots d}=1\qquad(2)$$

Since there may be a great number of terms in Eq. (2), Homma and Saltelli proposed a further quantity S_Ti, known as the total sensitivity index (TSI) [7], which is defined as

$$S_{Ti}=\frac{E\{\mathrm{var}(Y|X_{-i})\}}{\mathrm{var}(Y)}=1-\frac{\mathrm{var}\{E(Y|X_{-i})\}}{\mathrm{var}(Y)}\qquad(3)$$

This measure is therefore the sum of the variance caused by X_i and all of its interactions with other variables. Finally, another useful (but qualitative) indicator of sensitivity is to plot the main effect E(Y|X_i) against X_i, which illustrates how the output varies with respect to variations in a single variable.

All of the quantities addressed here can be expressed as integrals, e.g.

$$E(Y|X_i)=\int_{\mathcal{X}_{-i}}f(\mathbf{x})\,p(\mathbf{x}_{-i}|x_i)\,d\mathbf{x}_{-i}\qquad(4)$$

where $\mathcal{X}_{-i}$ denotes the sample space (support) of the variables X_{−i}, and the subscript −i denotes all variables except i. If the function f(x) were tractable, this could be done analytically; however, in the majority of practical cases it is not (consider a large FE model, for instance). In such cases, the usual approach is to use the Monte Carlo (MC) method [8], using specific estimators built for the purpose [9] and [10]. To achieve an acceptable level of accuracy, this requires sampling the model a large number N_MC of times across the input space (at least several hundred, often thousands) for every quantity to be calculated, i.e. d × N_MC samples for the MEIs, plus the same number again for the TSIs. The key problem with this approach is that many large models require minutes, hours or longer to evaluate f(x) at a single input point, so the cost of a thorough sensitivity analysis can be completely prohibitive. In order to reduce computational expense, a class of methods exists which involves building an emulator (equivalently, a metamodel or surrogate model) that imitates the behaviour of the original model but is considerably (computationally) cheaper to run.
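To make the variance-based measures concrete, here is a hedged sketch of one common family of Monte Carlo "pick-and-freeze" estimators (Saltelli/Jansen-type) for the main effect and total sensitivity indices. The specific estimators cited in [9] and [10] may differ in detail, and the test function below is purely illustrative.

```python
# Sketch of pick-and-freeze Monte Carlo estimators for the main effect indices
# S_i and total sensitivity indices S_Ti (Saltelli/Jansen-type estimators; the
# variants used in the paper's references may differ).
import numpy as np

def sobol_indices(f, sampler, d, N=50_000, rng=None):
    rng = np.random.default_rng(rng)
    A = sampler(N, d, rng)                 # two independent N x d input samples
    B = sampler(N, d, rng)
    fA, fB = f(A), f(B)
    var_Y = np.var(np.concatenate([fA, fB]))
    S, ST = np.empty(d), np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                # resample only X_i, freeze the rest
        fABi = f(ABi)
        S[i] = np.mean(fB * (fABi - fA)) / var_Y          # main effect index S_i
        ST[i] = 0.5 * np.mean((fA - fABi) ** 2) / var_Y   # total index S_Ti (Jansen)
    return S, ST

# Example on a simple test function, with inputs assumed uniform on [0, 1]^3.
def toy(x):
    return x[:, 0] + 2.0 * x[:, 1] ** 2 + 0.5 * x[:, 0] * x[:, 2]

S, ST = sobol_indices(toy, lambda n, d, r: r.uniform(size=(n, d)), d=3)
print("Main effect indices :", S.round(3))
print("Total indices       :", ST.round(3))
```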
The emulator is trained from a small number n of training data, consisting of model runs at selected values of the input variables, the idea being that n ≪ N_MC. It may then be used to provide uncertainty and sensitivity estimates either by applying MC techniques to the emulator or, even better, by analytically propagating uncertainty if the emulator is sufficiently tractable. Many types of emulator exist: well-known choices include linear models (including linear combinations of basis functions, such as polynomials, Fourier expansions and other orthogonal expansions), radial basis functions, artificial neural networks, splines and support vector machines, all of which are discussed in [11]. Some newer approaches include "gradient boosting machines" [12] (a method based on nested "weak learners"), ACOSSO [13] (a method using multivariate splines and variable selection), "multivariate adaptive regression splines" (MARS) [14] (using an optimised combination of linear splines), and Gaussian processes (GPs) [15]. Comparisons of these methods can be found in [16], [17], [18] and [19].

It is clear that desirable properties of an emulator include efficiency (the ability to emulate a model from as few training data as possible) and, ideally, tractability (though this is often circumvented using MC). Perhaps most important is the ability to emulate a wide class of model responses, which could be called "flexibility" or "generality". As a simple example, consider fitting a linear regression to data that vary sinusoidally: it is clear that this will provide a very poor approximation of the true model. Changing basis functions or using some of the more sophisticated emulators described above allows the emulation of nonlinear models, but severe nonlinearities pose difficulties. Of particular interest to this paper is the case when a model has a discontinuity in its response, perhaps caused by a bifurcation. Emulators such as ACOSSO and MARS are based on splines, which assume continuous functions, and are therefore not suited to such problems. Similarly, GPs assume a smooth response.

This work investigates a new emulator introduced by Gramacy and Lee [20] which uses Classification and Regression Trees (CARTs) to divide the input space, and fits a GP in each region. If the divisions are located at bifurcation points, this gives the possibility of faithfully modelling severely nonlinear data. Such "treed GPs" were originally developed to model heteroskedastic data (data with a non-constant variance). In this work the treed GP will be tested in the context of uncertainty and sensitivity analysis of bifurcating or discontinuous models. While the approach is not the invention of the authors of this paper, the intention here is to present this cutting-edge emulation method in a walkthrough format that is approachable by statistically-minded engineers, in a sense "bridging the gap" between the computer science and engineering communities. At the same time, its relevance to mechanical engineering is demonstrated on two case studies that illustrate the power of the approach.

This paper is organised as follows: in Section 2 the theory of GPs is introduced, after which treed GPs are discussed in Section 3. In Sections 4 and 5, treed and non-treed GPs are applied to bifurcating data generated by a Duffing oscillator, followed by a bifurcating biomechanical FE model. Conclusions follow.
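The following sketch illustrates the partition-then-emulate idea on a one-dimensional toy discontinuity, assuming scikit-learn is available: a single GP smooths over the jump, whereas fitting separate GPs on either side of an axis-aligned split proposed by a shallow decision tree captures it. This is only a caricature of the concept, not the Bayesian treed GP of Gramacy and Lee [20] (which averages over trees and GP hyperparameters and is available in Gramacy's R package tgp); the toy model, kernel and split procedure are assumptions made for illustration.

```python
# Minimal sketch of partition-then-emulate: one GP across a step discontinuity
# versus separate GPs on either side of a tree-proposed, axis-aligned split.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)

def bifurcating_model(x):
    """1-D toy response with a jump at x = 0.5 (stand-in for a bifurcation)."""
    return np.where(x < 0.5, np.sin(6.0 * x), 2.0 + 0.3 * x)

# Small training set, as if each run of the true model were expensive.
X_train = rng.uniform(0.0, 1.0, size=(30, 1))
y_train = bifurcating_model(X_train[:, 0])

kernel = RBF(length_scale=0.2) + WhiteKernel(noise_level=1e-4)

# (a) One GP over the whole input space: the jump is smoothed over.
gp_single = GaussianProcessRegressor(kernel=kernel).fit(X_train, y_train)

# (b) A shallow tree proposes an axis-aligned split, then one GP per region.
tree = DecisionTreeRegressor(max_depth=1).fit(X_train, y_train)
split = tree.tree_.threshold[0]
gps = {}
for name, mask in [("left", X_train[:, 0] <= split), ("right", X_train[:, 0] > split)]:
    gps[name] = GaussianProcessRegressor(kernel=kernel).fit(X_train[mask], y_train[mask])

X_test = np.linspace(0.0, 1.0, 9).reshape(-1, 1)
pred_treed = np.where(X_test[:, 0] <= split,
                      gps["left"].predict(X_test),
                      gps["right"].predict(X_test))
print("single GP :", gp_single.predict(X_test).round(2))
print("treed GP  :", pred_treed.round(2))
```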
English Conclusion
This work has shown that the treed GP is capable of modelling bifurcating and discontinuous response surfaces that are beyond the capability of standard GPs and other standard nonlinear regression approaches, and lends itself readily to uncertainty and sensitivity analysis. The two case studies used here have highlighted that there is a need for such methods in engineering, since bifurcating responses can occur through, for example, the cubic stiffness term in the Duffing oscillator, or buckling in the case of the heart valve. The treed GP does have a shortcoming, in that it is not able to accurately model bifurcations that are not parallel to coordinate axes (Kim et al. [46] have indeed addressed this issue to some extent with the use of Voronoi tessellations). However, estimates of the posterior predictive variance, combined with diagnostics such as cross-validation, allow bad fits to be identified easily and further training data to be directed to regions of high uncertainty. It is also worth emphasising that when a bifurcation is not present, the treed GP will revert to a single GP, or indeed a Bayesian linear regression, which avoids over-fitting.