Support vector regression from simulation data and few experimental samples
Article code | Publication year | Length of English article |
---|---|---|
24943 | 2008 | 15 pages (PDF) |
Publisher : Elsevier - Science Direct
Journal : Information Sciences, Volume 178, Issue 20, 15 October 2008, Pages 3813–3827
English abstract
This paper considers nonlinear modeling based on a limited amount of experimental data and a simulator built from prior knowledge. The problem of how to best incorporate the data provided by the simulator, possibly biased, into the learning of the model is addressed. This problem, although particular, is very representative of numerous situations met in engine control, and more generally in engineering, where complex models, more or less accurate, exist and where the experimental data which can be used for calibration are difficult or expensive to obtain. The first proposed method constrains the function to fit the values given by the simulator with a certain accuracy, which allows the bias of the simulator to be taken into account. The second method constrains the derivatives of the model to fit the derivatives of a prior model previously estimated on the simulation data. The combination of these two forms of prior knowledge is also possible and considered. These approaches are implemented in the linear programming support vector regression (LP-SVR) framework by adding constraints that are linear with respect to the parameters to the optimization problem. Tests are then performed on an engine control application, namely the estimation of the in-cylinder residual gas fraction in a Spark Ignition (SI) engine with Variable Camshaft Timing (VCT). Promising results are obtained on this application. The experiments also show the importance of adding potential support vectors to the model when using Gaussian RBF kernels with very few training samples.
English introduction
The general problem of how to efficiently incorporate knowledge given by a prior simulation model into the learning of a nonlinear model from experimental data can be presented from an application point of view. Consider the modeling of the in-cylinder residual gas fraction in a Spark Ignition (SI) engine with Variable Camshaft Timing (VCT) for engine control. In this context, experimental measurements are complex and costly to obtain. On the other hand, a simulator built from physical knowledge may be available but cannot be embedded in a real-time controller. In engine control design (modeling, simulation, control synthesis, implementation and test), two types of models are commonly used:

• Low frequency models or Mean Value Engine Models (MVEM), with average values for the variables over the engine cycle. These models are often used in real-time engine control [1] and [5]. However, they must be calibrated on a sufficiently large number of experiments in order to be representative.

• High frequency simulation models that can simulate the evolution of the variables during the engine cycle [6]. These models, of varying complexity from zero-dimensional to three-dimensional models, are mostly based on fewer parameters with physical meaning. However, they cannot be embedded in real-time controllers.

The idea is thus to build an embeddable black box model by taking into account a prior simulation model, which is representative but possibly biased, in order to limit the number of required measurements. The prior model is used to generate simulation data for arbitrarily chosen inputs in order to compensate for the lack of experimental samples in some regions of the input space.
This problem, although particular, is representative of numerous situations met in engine control, and more generally in engineering, where complex models, more or less accurate, exist, providing prior knowledge in the form of simulation data, and where the experimental data which can be used for calibration are difficult or expensive to obtain. The remainder of the paper studies various methods for the incorporation of these simulation data into the training of the model.

In nonlinear function approximation, kernel methods, and more particularly Support Vector Regression (SVR) [24], have proved able to give excellent performance in various applications [18], [17] and [14]. SVR aims at learning an unknown function based on a training set of $N$ input-output pairs $(\mathbf{x}_i, y_i)$ in a black box modeling approach. In its original form, it consists in finding the function of smallest complexity that deviates by at most $\varepsilon$ from the training samples [22]. Thus, SVR amounts to solving a constrained optimization problem, in which the complexity, measured by the norm of the parameters, is minimized. Allowing for the cases where the constraints cannot all be satisfied (some points have a deviation larger than $\varepsilon$) leads to minimizing an $\varepsilon$-insensitive loss function, which yields a zero loss for a point with error less than $\varepsilon$ and corresponds to an absolute loss for the others. The SVR algorithm can thus be written as a quadratic programming (QP) problem, where both the $\ell_1$-norm of the errors larger than $\varepsilon$ and the $\ell_2$-norm of the parameters are minimized. To deal with nonlinear tasks, SVR uses kernel functions, such as the Radial Basis Function (RBF) kernel, which extend linear methods to nonlinear problems via an implicit mapping into a higher dimensional feature space.
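For reference, the $\varepsilon$-insensitive loss and the QP problem described above can be written out as follows. This is the usual textbook form of SVR, not necessarily the exact formulation used in the paper: the symbols $\mathbf{w}$, $b$, $C$, $\xi_i$, $\xi_i^{*}$ and the feature map $\phi$ do not appear in this excerpt and are assumptions. In this form the norm of the parameters enters squared, and a point outside the tube is penalized by its absolute error minus $\varepsilon$.

```latex
% epsilon-insensitive loss for a residual e = y - f(x)
|e|_{\varepsilon} = \max\bigl(0,\; |e| - \varepsilon\bigr)

% Textbook QP form of SVR, with f(x) = <w, phi(x)> + b
\begin{aligned}
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi},\, \boldsymbol{\xi}^{*}} \quad
  & \tfrac{1}{2}\,\|\mathbf{w}\|_2^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^{*}) \\
\text{s.t.} \quad
  & y_i - f(\mathbf{x}_i) \le \varepsilon + \xi_i, \\
  & f(\mathbf{x}_i) - y_i \le \varepsilon + \xi_i^{*}, \\
  & \xi_i \ge 0,\ \xi_i^{*} \ge 0, \qquad i = 1, \dots, N.
\end{aligned}
```

The slack variables $\xi_i$ and $\xi_i^{*}$ measure how far a sample lies above or below the $\varepsilon$-tube, so their sum is the $\ell_1$-norm of the errors larger than $\varepsilon$, weighted against the complexity term by $C$.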
Compared to neural networks, SVR has the following advantages: automatic selection and sparsity of RBF centers, intrinsic regularization, no local minima (convex problem with a unique solution), and good generalization ability from a limited amount of samples. In addition, the $\varepsilon$-insensitive loss improves the robustness to outliers compared to quadratic criteria. Other formulations of the SVR problem, minimizing the $\ell_1$-norm of the parameters, can be derived to yield linear programs (LP) [25], [23] and [16]. Compared to the QP formulation, this latter approach has some advantages, such as an increased sparsity of support vectors [25] and [23] or the ability to use more general kernels [15]. The remainder of the paper will thus focus on the LP formulation of SVR (LP-SVR). After a presentation of the LP-SVR problem (Section 2), the paper uses the framework of [10] to extend the problem with additional constraints, linear with respect to the parameters, in order to include prior knowledge in the learning (Section 3). The methods are presented in turn for the inclusion of knowledge on the output values (Section 3.1), on the derivatives of the model (Section 3.2) and for the addition of potential support vectors (Section 3.3). Finally, the various ways of incorporating prior knowledge in the form of simulation data with these techniques are tested on the in-cylinder residual gas fraction data in Section 4.

Notations: all vectors are column vectors written in boldface and lowercase letters whereas matrices are boldface and uppercase, except for the $i$th column of a matrix $\mathbf{A}$, which is denoted $\mathbf{A}_i$. The vectors $\mathbf{0}$ and $\mathbf{1}$ are vectors of appropriate dimensions with all their components respectively equal to 0 and 1. For $\mathbf{A} \in \mathbb{R}^{d \times m}$ and $\mathbf{B} \in \mathbb{R}^{d \times n}$ containing $d$-dimensional sample vectors, the "kernel" $\mathbf{K}(\mathbf{A}, \mathbf{B})$ maps $\mathbb{R}^{d \times m} \times \mathbb{R}^{d \times n}$ into $\mathbb{R}^{m \times n}$ with $\mathbf{K}(\mathbf{A}, \mathbf{B})_{i,j} = k(\mathbf{A}_i, \mathbf{B}_j)$, where $k : \mathbb{R}^{d} \times \mathbb{R}^{d} \to \mathbb{R}$ is the kernel function. In particular, if $\mathbf{x} \in \mathbb{R}^{d}$ is a column vector then $\mathbf{K}(\mathbf{x}, \mathbf{B})$ is a row vector in $\mathbb{R}^{1 \times n}$. The matrix $\mathbf{X} \in \mathbb{R}^{N \times d}$ contains all the training samples $\mathbf{x}_i$, $i = 1, \dots, N$, as rows. The vector $\mathbf{y} \in \mathbb{R}^{N}$ gathers all the target values $y_i$ for these samples. The uppercase $\mathcal{Z}$ denotes a set containing $|\mathcal{Z}|$ vectors that constitute the rows of the matrix $\mathbf{Z}$.
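The kernel notation above can be illustrated with a minimal NumPy sketch. It assumes a Gaussian RBF kernel of the form $k(\mathbf{a}, \mathbf{b}) = \exp(-\|\mathbf{a} - \mathbf{b}\|^2 / (2\sigma^2))$; the function name `rbf_kernel_matrix`, the width parameter `sigma` and the toy data are hypothetical and not taken from the paper.

```python
import numpy as np

def rbf_kernel_matrix(A, B, sigma=1.0):
    """Gaussian RBF kernel matrix K(A, B) for samples stored as columns.

    A has shape (d, m) and B has shape (d, n); the result has shape (m, n)
    with entry (i, j) equal to exp(-||A_i - B_j||^2 / (2 * sigma**2)).
    A 1-D input is treated as a single column vector.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    if A.ndim == 1:
        A = A[:, None]
    if B.ndim == 1:
        B = B[:, None]
    # Pairwise differences between every column of A and every column of B.
    diff = A[:, :, None] - B[:, None, :]   # shape (d, m, n)
    sq_dist = np.sum(diff ** 2, axis=0)    # shape (m, n)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

# Toy data (assumed): N = 4 training samples in d = 2, stored as rows of X.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# X holds samples as rows, so its transpose is passed to match K(A, B).
K = rbf_kernel_matrix(X.T, X.T)    # shape (N, N)

# For a single column vector x, K(x, X^T) is a row vector of shape (1, N).
x = np.array([0.5, 0.5])
k_row = rbf_kernel_matrix(x, X.T)
```

Because $\mathbf{X}$ stores the samples as rows while $\mathbf{K}(\cdot, \cdot)$ expects them as columns, the training kernel matrix is obtained from the transpose of $\mathbf{X}$ in this sketch.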
English conclusion
This paper uses simple and effective techniques for the incorporation of prior knowledge into LP-SVR learning. This prior information may be given in terms of output values as well as derivative values on a set of points. Various methods based on these techniques have been studied for the inclusion of knowledge in the form of simulation data. The proposed methods have been tested on the estimation of the in-cylinder residual gas fraction. In this context, real data are available only in limited number due to the cost of experimental measurements, but additional data can be obtained from a complex physical simulator. Since the output of the simulator is biased but provides rather good information on the overall shape of the model, the prior information on the derivatives, provided by a prior model trained on the simulation data, is the most relevant. Models enhanced by this knowledge thus yield the best performance. The additional hyperparameters of the method weight the prior knowledge with respect to the data and can thus be chosen in accordance with the confidence in the prior information. Moreover, the sensitivity of the method to the tuning of these hyperparameters has been shown experimentally, though only on a particular example, to be very low as long as the data are not too few. Finally, the experiments have also shown the importance of adding potential support vectors to the model when using a local kernel, such as the Gaussian RBF kernel, with few training samples.