The paper investigates a nonparametric regression method based on the smoothing spline analysis of variance (ANOVA) approach to address the problem of global sensitivity analysis (GSA) of complex and computationally demanding computer codes. The two-step algorithm of this method involves an estimation procedure and a variable selection procedure. The latter can become computationally demanding in high-dimensional problems. We therefore propose a new algorithm based on Landweber iterations. Using the fact that the considered regression method is based on an ANOVA decomposition, we introduce a new direct method for computing sensitivity indices. Numerical tests performed on several analytical examples and on an application from petroleum reservoir engineering show that the method gives competitive results compared to a more standard Gaussian process approach.
Recent significant advances in computational power have made computer modeling and simulation an integral tool in many industrial and scientific applications, such as nuclear safety assessment, meteorology or oil reservoir forecasting. Simulations are performed with complex computer codes that model diverse real-world phenomena. The inputs of such computer codes are estimated by experts or by indirect measurements and can be highly uncertain. It is therefore important to identify the most significant inputs, that is, those that contribute most to the variability of the model prediction. This task is generally performed by variance-based sensitivity analysis, also known as global sensitivity analysis (GSA) (see [1] and [2]).
The aim of GSA for computer codes is to quantify how the variation of an output of the computer code is apportioned to the different inputs of the model. The most useful methods for performing sensitivity analysis require stochastic simulation techniques, such as Monte-Carlo methods. These methods usually involve several thousand computer code evaluations, which are generally not affordable for realistic models in which each simulation requires several minutes, hours or days. Consequently, meta-modeling methods become an interesting alternative.
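To make the cost argument concrete, the following is a minimal sketch of a Monte-Carlo estimate of first-order sensitivity indices using a standard pick-freeze estimator. It is an illustration only: the names `first_order_sobol` and `toy_model`, the independent uniform inputs on [0, 1] and the sample size are assumptions, and the estimator is not necessarily the one used in this paper.

```python
import random


def first_order_sobol(model, dim, n_samples, seed=0):
    """Pick-freeze Monte-Carlo estimate of first-order sensitivity indices.

    Assumes independent inputs, each uniform on [0, 1] (illustrative choice).
    """
    rng = random.Random(seed)
    # Two independent sample matrices A and B of size n_samples x dim.
    a = [[rng.random() for _ in range(dim)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(dim)] for _ in range(n_samples)]
    y_a = [model(row) for row in a]
    y_b = [model(row) for row in b]

    # Total output variance, estimated from all available evaluations.
    all_y = y_a + y_b
    mean = sum(all_y) / len(all_y)
    variance = sum((y - mean) ** 2 for y in all_y) / (len(all_y) - 1)

    indices = []
    for i in range(dim):
        partial = 0.0
        for row_a, row_b, ya, yb in zip(a, b, y_a, y_b):
            # Row of A whose i-th coordinate is replaced by that of B.
            mixed = row_a[:i] + [row_b[i]] + row_a[i + 1:]
            partial += yb * (model(mixed) - ya)
        indices.append(partial / n_samples / variance)
    return indices


def toy_model(x):
    """Hypothetical cheap stand-in for an expensive computer code."""
    return (x[0] - 0.5) + 2.0 * (x[1] - 0.5)


n_samples = 20000
indices = first_order_sobol(toy_model, dim=2, n_samples=n_samples)
n_model_runs = n_samples * (2 + 2)  # n_samples * (dim + 2) evaluations
```

For this toy model the exact first-order indices are 0.2 and 0.8, so the estimates can be checked. Even with only two inputs, the estimate uses `n_samples * (dim + 2)` model evaluations, which is what makes the direct approach impractical when a single run takes minutes or hours.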
A meta-model is an approximation of the computer code's input/output relation that is fast to evaluate. The general idea of this approach is to perform a limited number of model evaluations (hundreds) at carefully chosen training input values and then to use statistical regression techniques to construct an approximation of the model. If the resulting approximation is of good quality, the meta-model is used instead of the complex and computationally demanding computer code to perform the GSA.
The most commonly used meta-modeling methods are those based on parametric polynomial regression models, which require specifying the polynomial form of the regression mean (linear, quadratic, etc.). However, a linear (or quadratic) model often fails to identify the input/output relation properly. Thus, in nonlinear situations, nonparametric regression methods are preferred.
In the last decade, many different nonparametric regression models have been used as meta-modeling methods. To name a few, [3], [4] and [5] utilized a Gaussian process (GP), while [6] and [7] used polynomial chaos expansions to perform a GSA.
In addition, [8], [9] and [10] compare various parametric and nonparametric regression models, such as linear regression (LREG), quadratic regression (QREG), projection pursuit regression, multivariate adaptive regression splines (MARS), gradient boosting regression, random forest, Gaussian process (GP) and adaptive component selection and smoothing operator (ACOSSO), in order to provide appropriate metamodel strategies.
In this work, we focus on a modern nonparametric regression method based on the smoothing spline ANOVA (SS-ANOVA) model and the component selection and smoothing operator (COSSO) regularization, which can be seen as an extension of the LASSO [11] variable selection method from parametric to nonparametric models. Moreover, we use the ANOVA decomposition basis of the COSSO to introduce a direct method for computing the sensitivity indices.
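For orientation, the two ingredients named above can be written in the form commonly used in the SS-ANOVA and COSSO literature. The notation here ($b_0$, $f_j$, $f_{jk}$, $P^j$, $\lambda$, $q$, $\mathcal{F}$) is an assumption made for this sketch and may differ from the notation adopted in the remainder of the paper.

```latex
% ANOVA decomposition of the regression function into a constant,
% main effects, two-way interactions, and higher-order terms
f(x) = b_0 + \sum_{j=1}^{d} f_j(x_j) + \sum_{j<k} f_{jk}(x_j, x_k) + \cdots

% COSSO estimate: least squares penalized by the sum of the norms of the
% q functional components (P^j f is the projection of f onto the j-th
% component subspace of the function space \mathcal{F})
\hat{f} = \arg\min_{f \in \mathcal{F}}
  \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - f(x_i) \bigr)^2
  + \lambda^2 \sum_{j=1}^{q} \lVert P^j f \rVert
```

Because the penalty is a sum of norms rather than of squared norms, some components can be shrunk exactly to zero, which is the sense in which the method parallels the LASSO.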
In this paper, we first review the SS-ANOVA model and then describe the COSSO method and its algorithm. We then introduce two new algorithms that provide the COSSO estimates: the first is an iterative algorithm based on Landweber iterations, and the second uses a modified least angle regression (LARS) algorithm (see [12] and [13]). Next, we describe our new method for computing the sensitivity indices. Finally, numerical simulations and an application from petroleum reservoir engineering are presented and discussed.
In this work, we presented the COSSO regularized nonparametric regression method, which is a model fitting and variable selection procedure. One of the steps of the COSSO algorithm is the NNG optimization problem. The original COSSO algorithm uses classical constrained optimization techniques to solve the NNG problem. These techniques are effective but time consuming, especially for high-dimensional problems (as shown empirically) and for large experimental designs (large numbers of observations). We developed a new iterative algorithm, called IPS, together with its accelerated version (AIPS). Based on Landweber iterations, these procedures are conceptually simple and easy to implement.
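As an illustration of why Landweber-type procedures are simple to implement, the sketch below applies a projected Landweber iteration to a nonnegatively constrained least-squares problem with a penalty on the sum of the coefficients. This is not the IPS or AIPS algorithm of the paper: the penalized form of the problem, the step size, and the names `projected_landweber`, `G`, `z` and `penalty` are assumptions chosen for the example.

```python
def projected_landweber(G, z, penalty, n_iter=5000):
    """Approximately minimize 0.5 * ||z - G theta||^2 + penalty * sum(theta)
    subject to theta >= 0, by a projected Landweber (gradient) iteration.

    G is a list of rows (n x p) and z a list of length n (hypothetical inputs).
    """
    n, p = len(G), len(G[0])
    # Safe step size: 1 / ||G||_F^2 never exceeds 1 / ||G||_2^2.
    step = 1.0 / sum(g * g for row in G for g in row)
    theta = [0.0] * p
    for _ in range(n_iter):
        # Residual of the current fit.
        residual = [
            z[i] - sum(G[i][j] * theta[j] for j in range(p)) for i in range(n)
        ]
        # Landweber direction G^T (z - G theta).
        direction = [
            sum(G[i][j] * residual[i] for i in range(n)) for j in range(p)
        ]
        # Gradient step, shrinkage by the penalty, projection onto theta >= 0.
        theta = [
            max(0.0, theta[j] + step * (direction[j] - penalty))
            for j in range(p)
        ]
    return theta


# Small design with orthonormal columns, so the answer can be checked by hand:
# each coefficient equals max(0, (G^T z)_j - penalty).
G = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
z = [2.0, 0.1, 0.0]
theta_hat = projected_landweber(G, z, penalty=0.5)
```

Each iteration needs only matrix-vector products and a componentwise maximum, with no call to a constrained optimization solver. The weakly supported second coefficient is set exactly to zero, which is the variable selection effect.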
We also applied the NN-LARS algorithm to COSSO, which also has a competitive computation time compared to the original COSSO (COSSO-solver). We showed empirically that COSSO based on the AIPS algorithm is the fastest COSSO version.
Moreover, we used the ANOVA decomposition basis of the COSSO to introduce a direct method for computing the Sobol’ indices. We applied COSSO to the problem of GSA for several analytical models and synthetic reservoir test cases, and we compared its performance to that of the GP method combined with the Sobol’ Monte-Carlo method. For all the test cases, COSSO showed very competitive performance, especially the COSSO–AIPS version, for which the computational gain was significant compared to COSSO-solver and GP. Consequently, COSSO–AIPS constitutes an efficient and practical approach to GSA.
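One way to read the direct method: when the fitted model is a sum of mutually orthogonal ANOVA components, the output variance splits into the variances of the components, and each index is a ratio of variances that requires no additional Monte-Carlo runs of the code. The sketch below assumes that the component values at the sample points are already available. The name `sobol_from_components` and the toy components are hypothetical, and the estimator used in the paper may differ in its details.

```python
import random
from statistics import pvariance


def sobol_from_components(components):
    """Variance ratio of each fitted ANOVA component.

    `components` maps a component name to its values at the sample points.
    Assumes the components are mutually orthogonal, so that their variances
    add up to the variance of the fitted model.
    """
    variances = {name: pvariance(values) for name, values in components.items()}
    total = sum(variances.values())
    return {name: var / total for name, var in variances.items()}


# Toy fitted components of an additive model, evaluated at 2000 points drawn
# from independent uniform inputs on [0, 1] (illustrative choice).
rng = random.Random(0)
x1 = [rng.random() for _ in range(2000)]
x2 = [rng.random() for _ in range(2000)]
components = {
    "x1": [a - 0.5 for a in x1],
    "x2": [2.0 * (b - 0.5) for b in x2],
}
indices = sobol_from_components(components)
```

For these toy components the exact ratios are 0.2 and 0.8. Interaction components, if present in the fitted model, would enter the dictionary as additional entries.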
It may be possible to improve the performance of COSSO–AIPS by using adaptive weights in the COSSO penalty [30], which may allow more flexibility in estimating influential functional components while at the same time imposing a heavier penalty on non-influential functional components.