Asymptotic bootstrap corrections of AIC for linear regression models
Article code | Publication year | Length of English article |
---|---|---|
24295 | 2010 | 8 pages (PDF) |
Publisher : Elsevier - Science Direct
Journal : Signal Processing, Volume 90, Issue 1, January 2010, Pages 217–224
Abstract
The Akaike information criterion, AIC, and its corrected version, AICc, are two methods for selecting normal linear regression models. Both criteria were designed as estimators of the expected Kullback–Leibler information between the model generating the data and the approximating candidate model. In this paper, two new corrected variants of AIC are derived for small-sample linear regression model selection. The proposed variants are based on asymptotic approximations of bootstrap-type estimates of the Kullback–Leibler information. They are of particular interest when the computational cost of the bootstrap is not justified. As is the case for AICc, these new variants are asymptotically equivalent to AIC. Simulation results are presented which illustrate the better performance of the proposed AIC corrections, compared with AIC, AICc and other criteria, when applied to polynomial regression. Asymptotic justifications for the proposed criteria are provided in the Appendix.
Introduction
In a variety of scientific and engineering modelling problems, an investigator is confronted with the problem of determining a suitable model, from a class of candidate models, to describe or characterize an experimental data set. For example, this is the case in image segmentation [1], signal denoising [2], channel order estimation [3], estimating the number of signals arriving at an array [4], estimating the number of principal components in PCA [5] and determining the filter order in adaptive estimation [6]. A companion to the problem of model determination or selection in data modelling is the problem of parameter estimation, which is generally solved by maximum likelihood or least squares. The selection of a model is often facilitated by the use of a model selection criterion, where one only has to evaluate two simple terms. The underlying idea of model selection criteria is the principle of parsimony, which says that there should be a trade-off between data fitting and complexity. Thus all criteria have one term defining a measure of fit, typically a deviance statistic, and one term characterizing the complexity, a multiple of the number of free parameters in the model, also called the penalty term. Largely stimulated by the ground-breaking work of Akaike [7], different strategies have been used to derive different model selection criteria. The minimum description length (MDL) criterion suggested in [8] implements the principle of parsimony by coding a data set economically, using the idea of universal coding introduced by Kolmogorov. The model chosen with the MDL can therefore be considered as providing the best explanation of the data in terms of code length. In [9] and [10], the Bayesian information criterion (BIC) was introduced, based on Bayesian arguments and maximum a posteriori probability.
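The two-term structure described above can be written compactly. The following is only a sketch in generic notation: the symbols $\mathrm{IC}$, $c_n$, $k$ and $\hat\theta_k$ are assumptions introduced here for illustration and are not taken from the paper.

```latex
% k: number of free parameters of the candidate model
% \hat\theta_k: maximum likelihood estimate under that model
% c_n: multiplier that distinguishes one criterion from another
\mathrm{IC}(k) \;=\;
  \underbrace{-2 \log L(\hat\theta_k \mid y)}_{\text{measure of fit}}
  \;+\;
  \underbrace{c_n \, k}_{\text{penalty term}},
\qquad
c_n = 2 \ \text{(AIC)}, \qquad c_n = \log n \ \text{(BIC)}.
```

The selected model is the candidate that minimizes $\mathrm{IC}(k)$ over the class considered.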
A model selection criterion can also be designed to estimate an expected overall discrepancy, a quantity which reflects the degree of disparity between a fitted approximating model and the generating or “true” model. Estimation of the Kullback–Leibler information [11] is the key to deriving the Akaike information criterion, AIC [12]. The FPE [13] and C_p [14] criteria follow from estimating the L_2 distance between the fitted candidate model and the true model. From the estimation of the Kullback–Leibler symmetric divergence [15] follows the Kullback information criterion, KIC [16]. In [17], a thorough investigation of the role of cross-validation was made, and the use of leave-many-out cross-validation, which requires intensive computing, was suggested for model selection. These criteria are both computationally and heuristically appealing, which partly explains their enduring popularity among practitioners. Except for the techniques based on cross-validation, these criteria suffer from one commonly observed drawback: their penalty terms are simple-minded bias corrections, and there is no assurance that such penalty terms yield a good model order estimate. Indeed, these criteria tend to produce a wrong model order estimate when the sample size is small relative to the largest model order represented within the class of approximating models. Many attempts have been made to improve these criteria, and AIC in particular (on which we focus in this paper), by reducing their finite-sample bias (or correcting their penalty terms). One such approach is to evaluate the penalty terms asymptotically as precisely as possible, in order to provide better estimates of the model order. In practice such precise asymptotic approximations can be quite effective [18]. As an alternative to asymptotic approximation techniques, one can use bootstrap techniques to estimate these penalty terms.
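To make the small-sample issue concrete, here is a minimal sketch of polynomial order selection with AIC and with the widely used Hurvich–Tsai form of AICc. It does not implement the corrections proposed in the paper. The function names (`gaussian_fit`, `aic`, `aicc`, `select_order`) are hypothetical, `k` is assumed to count the regression coefficients, and the error variance is counted as one extra free parameter.

```python
import numpy as np

def gaussian_fit(X, y):
    """Least-squares fit; returns (beta_hat, sigma2_hat) with the ML variance RSS/n."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid) / len(y)

def aic(n, k, sigma2):
    # n*log(sigma2) + n is -2 log-likelihood up to a constant;
    # the penalty 2*(k + 1) counts k coefficients plus the variance.
    return n * np.log(sigma2) + n + 2 * (k + 1)

def aicc(n, k, sigma2):
    # Hurvich-Tsai correction written as AIC plus an extra term; requires n > k + 2.
    return aic(n, k, sigma2) + 2 * (k + 1) * (k + 2) / (n - k - 2)

def select_order(x, y, max_degree, criterion):
    """Return (best polynomial degree, list of criterion values per degree)."""
    n = len(y)
    scores = []
    for d in range(max_degree + 1):
        X = np.vander(x, d + 1, increasing=True)  # columns 1, x, ..., x^d
        _, s2 = gaussian_fit(X, y)
        scores.append(criterion(n, d + 1, s2))
    return int(np.argmin(scores)), scores

# Illustrative data (an assumption): a noisy quadratic observed at 15 points.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 15)
y = 1 + 2 * x - 3 * x**2 + 0.3 * rng.standard_normal(x.size)

print("AIC  picks degree", select_order(x, y, 8, aic)[0])
print("AICc picks degree", select_order(x, y, 8, aicc)[0])
```

Because the extra AICc term grows with `k`, the degree chosen by AICc can never exceed the degree chosen by AIC on the same data. The gap matters most when `n` is close to the largest candidate order.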
The idea of using the bootstrap to improve the performance of a model selection rule has been suggested and investigated in [19] and [20]. There are various advantages to using bootstrap estimates instead of asymptotic approximations. Firstly, the resulting criteria are more general than those obtained using asymptotic approximations. Secondly, bootstrapping may produce better finite-sample accuracy than asymptotic approximations. However, the computational burden required to evaluate these bootstrap-based penalty terms is not always justifiable in practice, which means that alternative asymptotic approximations remain of interest. In this paper, two new corrected variants of AIC are derived for small-sample linear regression model selection. These proposed AIC variants can be considered as alternatives to AICc [18] when the bootstrap-based penalty terms are not justifiable in terms of the required calculations. Since AIC is a well-established and widely used criterion, any criterion that performs better will be an attractive substitute. The rest of the paper is organized as follows. In the next section a review of AIC is provided. Section 3 describes the existing bootstrap-based penalty terms that can be used, instead of the simple-minded bias correction used in AIC, to estimate the Kullback–Leibler information. The two new variants of AIC are derived in Section 4. In Section 5, the performance of the proposed criteria is compared to that of other criteria in a simulation example based on polynomial regression. Section 6 examines the probabilities of overfitting in small samples for one of the proposed criteria. Concluding remarks are given in Section 7, and the theoretical justifications of the proposed criteria are presented in the Appendices.
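A bootstrap-based penalty term, and why it costs more than a closed-form one, can be sketched as follows. This is a generic residual-bootstrap estimate of the optimism of the fitted log-likelihood. It is not claimed to be any of the specific penalty terms reviewed in Section 3 of the paper, and all function names and default values are assumptions.

```python
import numpy as np

def fit(X, y):
    """Least-squares coefficients and ML variance estimate RSS/n."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid) / len(y)

def neg2_loglik(X, y, beta, sigma2):
    """-2 x Gaussian log-likelihood of (beta, sigma2) evaluated on data (X, y)."""
    resid = y - X @ beta
    return len(y) * np.log(2 * np.pi * sigma2) + float(resid @ resid) / sigma2

def bootstrap_penalty(X, y, n_boot=200, seed=0):
    """Average gap between the fit on the original data and on each bootstrap sample."""
    rng = np.random.default_rng(seed)
    beta, _ = fit(X, y)
    fitted = X @ beta
    resid = y - fitted
    resid = resid - resid.mean()  # centre residuals before resampling
    gaps = []
    for _ in range(n_boot):
        # Residual bootstrap: keep the design fixed, resample the errors.
        y_star = fitted + rng.choice(resid, size=len(y), replace=True)
        b_star, s2_star = fit(X, y_star)  # one full refit per replicate
        gaps.append(neg2_loglik(X, y, b_star, s2_star)
                    - neg2_loglik(X, y_star, b_star, s2_star))
    return float(np.mean(gaps))

def bootstrap_criterion(X, y, **kwargs):
    """Goodness of fit plus the bootstrap-estimated penalty."""
    beta, sigma2 = fit(X, y)
    return neg2_loglik(X, y, beta, sigma2) + bootstrap_penalty(X, y, **kwargs)
```

Each candidate model needs `n_boot` refits, whereas an asymptotic correction is a closed-form expression in `n` and `k`. This is the computational trade-off the text refers to.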
Conclusion
Based on the asymptotic approximation of bootstrap-type estimates of the Kullback–Leibler divergence, two new corrected variants of AIC have been proposed for small-sample linear regression model selection. The results of Sections 5 and 6 suggest that AICc3 functions as an effective model selection criterion in small-sample applications when the computational cost of the bootstrap is not justified. The performance of AICc3 does not diminish the benefits of the bootstrap-corrected versions of AIC; on the contrary, it reinforces their advantages, since their asymptotic approximations yield effective corrected versions of AIC. The bootstrap corrections also remain of particular interest, in comparison to non-bootstrap corrections, when bootstrapping is advantageous in terms of the required calculations. Since the difference between AIC and either of the proposed corrected versions, AICc2 or AICc3, is a non-stochastic term of order 1/n, the proposed criteria are, like AIC [36], asymptotically efficient. By definition, an efficient criterion will asymptotically select the fitted candidate model that minimizes the one-step squared prediction error. The type of correction proposed in this paper is based on assuming a particular modelling structure for the candidate family M_k and using the presumed structure to derive a more precise approximation for the bias refinement (3). The demonstrated effectiveness of the proposed corrections as selection criteria supports extending this approach to develop new corrections for other popular model structures.
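The expressions for AICc2 and AICc3 are not reproduced in this excerpt, so the order-1/n argument is illustrated here with the standard Hurvich–Tsai form of AICc instead. The notation is an assumption: $k$ is the number of regression coefficients and $\hat\sigma^2$ is the maximum likelihood variance estimate.

```latex
\mathrm{AIC} = n\log\hat\sigma^{2} + n + 2(k+1),
\qquad
\mathrm{AIC}_c = n\log\hat\sigma^{2} + \frac{n(n+k)}{n-k-2},
\\[6pt]
\mathrm{AIC}_c - \mathrm{AIC}
  = \frac{n(n+k) - (n+2k+2)(n-k-2)}{n-k-2}
  = \frac{2(k+1)(k+2)}{n-k-2}
  = O\!\left(\tfrac{1}{n}\right)
  \quad \text{for fixed } k .
```

The difference depends only on $n$ and $k$, not on the data, and vanishes as $n \to \infty$. A correction of this kind therefore leaves the large-sample behaviour of AIC unchanged.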