On predictions obtained with the two-step bootstrap model averaging approach: a simulation study in the linear regression model
|Article code||Publication year||English paper||Persian translation||Word count|
|9841||2008||16-page PDF||available to order||9,336 words|
Publisher : Elsevier - Science Direct
Journal : Computational Statistics & Data Analysis, Volume 52, Issue 5, 20 January 2008, Pages 2778-2793
In many applications of model selection there is a large number of explanatory variables and thus a large set of candidate models. Selecting one single model for further inference ignores model selection uncertainty. Often several models fit the data equally well. However, these models may differ in terms of the variables included and might lead to different predictions. To account for model selection uncertainty, model averaging procedures have been proposed. Recently, an extended two-step bootstrap model averaging approach has been proposed. The first step of this approach is a screening step. It aims to eliminate variables with negligible effect on the outcome. In the second step the remaining variables are considered in bootstrap model averaging. A large simulation study is performed to compare the MSE and coverage rate of models derived with bootstrap model averaging, the full model, backward elimination using Akaike and Bayes information criterion and the model with the highest selection probability in bootstrap samples. In a data example, these approaches are also compared with Bayesian model averaging. Finally, some recommendations for the development of predictive models are given.
The identification of factors and models for predicting an outcome is of major interest in many areas of application. Often a large number of potential explanatory variables is collected, leading to a large set of candidate models, from which usually one single model is chosen for prediction. Here we consider the in- or exclusion of candidate variables in a linear regression model as the only model building task. The model space would be enlarged considerably if further issues, such as the determination of a functional form for a continuous variable or another type of regression model, were considered. When a model is constructed from 15 variables, 2^15 = 32,768 model combinations are possible. Ultimately, only one model will be selected. It is well known that several models often fit the data equally well, but may differ substantially in terms of the variables included and might lead to different predictions for individual observations (Miller, 2002). Efficient algorithms are available (Hofmann et al., 2007), but ignoring model selection uncertainty may lead to biased parameter estimates and underestimation of variance (Draper, 1995; Chatfield, 1995). To account for model selection uncertainty, model averaging (MA) procedures have been proposed. The MA estimate is obtained as a weighted average of a set of estimated predictors obtained under different models. Advantages of MA are stressed in many papers, but usually the evidence is restricted to case studies (Hoeting et al., 1999; Volinsky et al., 1997; Augustin et al., 2005) or to analytical results for restricted problems, such as a very small set of candidate models, independence between predictors, or the assumption of a local asymptotic framework (Buckland et al., 1997; Candolo et al., 2003; Yuan and Yang, 2005; Hjort and Claeskens, 2003). Over the past few years, much work has been done in a Bayesian model averaging (BMA) framework (Hoeting et al., 1999; Raftery et al., 1997).
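The weighted-average idea can be made concrete with a small sketch. The snippet below is our own illustration, not the paper's implementation: it enumerates all 2^3 subsets of three toy predictors, fits each by OLS, and combines the single-model predictions with AIC-based weights of the form w_j ∝ exp(-ΔAIC_j/2), as suggested by Buckland et al. (1997). All data and names here are invented for the example.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)

# Toy data for illustration: three candidate predictors, of which only the
# first two actually affect the outcome.
n = 200
X = rng.normal(size=(n, 3))
y = 1.0 + 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(size=n)

def ols(Xs, y):
    """OLS with intercept; returns fitted values and residual sum of squares.

    Works for the empty model too (Xs with zero columns -> intercept only)."""
    A = np.column_stack([np.ones(len(y)), Xs])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ beta
    return fitted, float(np.sum((y - fitted) ** 2))

# Enumerate all 2^3 = 8 candidate models (in-/exclusion of each variable).
models = [s for r in range(4) for s in combinations(range(3), r)]
aics, fits = [], []
for s in models:
    fitted, rss = ols(X[:, list(s)], y)
    k = len(s) + 2  # regression coefficients incl. intercept, plus sigma^2
    aics.append(n * np.log(rss / n) + 2 * k)
    fits.append(fitted)

# Akaike weights: w_j proportional to exp(-deltaAIC_j / 2), normalised to 1.
aics = np.array(aics)
w = np.exp(-(aics - aics.min()) / 2.0)
w /= w.sum()

# The MA predictor is the weighted average of the single-model predictions.
y_ma = sum(wj * f for wj, f in zip(w, fits))
```

Models far from the data (here, any subset missing the strong first two variables) receive weights that are numerically zero, so the average is dominated by the few well-supported models.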
An alternative to BMA is bootstrap model averaging (bootstrapMA), first proposed by Buckland et al. (1997) and modified by Augustin et al. (2005) to include a variable screening step prior to bootstrap model averaging, in order to identify and eliminate variables with no or a negligible effect on the outcome. This results in a much smaller class of candidate models for the MA step. For problems with a larger number of variables (say more than 10) the importance of a screening step seems to be well accepted. With BMA, Occam's window is usually used (Hoeting et al., 1999). Burnham and Anderson (2002) argue for a selection of models based on subject matter knowledge, and Yuan and Yang (2005) propose in their ARMS algorithm to keep the top m models (in examples they use m=40) based on the Akaike information criterion (AIC) or the Bayes information criterion (BIC) in one part of the data. In contrast to other screening approaches, which eliminate models not strongly supported by the data, we eliminate variables not strongly supported by the data. The main reason is to increase the potential future use of our models: it is then not required to collect all variables in a new data set. A simulation study showed that our screening step reduces the number of variables, and correspondingly the number of candidate models, without eliminating models strongly supported by the data (Sauerbrei et al., 2006). First promising results for bootstrapMA were shown in a small simulation study (Holländer et al., 2006). Here we present the design in detail and substantially extend the simulation study. Using MSE and coverage rate as criteria, we compare the predictive performance of models derived with bootstrapMA to the full model, backward elimination (BE) using AIC and BIC, and the model with the highest selection probability in bootstrap samples. In a study on school children, the aim is to predict forced vital capacity (FVC) from 24 variables.
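A minimal sketch of the two-step procedure may help fix ideas. Everything below is our own simplified reconstruction under stated assumptions, not the authors' code: we use backward elimination on AIC as the selection procedure, a bootstrap inclusion frequency per *variable* for the screening step (the 30% cut-off is our choice for illustration), and bootstrap selection frequencies per *model* as the MA weights. Data, cut-offs and replication counts are invented toy values.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: six candidate variables, only the first two affect the outcome.
n, p = 150, 6
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(size=n)

def backward_aic(X, y, cols):
    """Backward elimination on AIC; returns retained columns as a sorted tuple."""
    def aic(cs):
        A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cs])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        rss = np.sum((y - A @ beta) ** 2)
        return len(y) * np.log(rss / len(y)) + 2 * (len(cs) + 2)
    cols = list(cols)
    best = aic(cols)
    improved = True
    while improved and cols:
        improved = False
        for c in list(cols):
            trial = [d for d in cols if d != c]
            a = aic(trial)
            if a < best:  # dropping c improves AIC -> accept and restart
                best, cols, improved = a, trial, True
                break
    return tuple(sorted(cols))

# Step 1 (screening): bootstrap inclusion frequencies per variable; eliminate
# variables selected in fewer than 30% of replications (cut-off is ours).
B1 = 40
counts = np.zeros(p)
for _ in range(B1):
    idx = rng.integers(0, n, size=n)
    for c in backward_aic(X[idx], y[idx], range(p)):
        counts[c] += 1
keep = [c for c in range(p) if counts[c] / B1 >= 0.3]

# Step 2 (bootstrapMA): repeat selection among the screened variables and
# weight each distinct selected model by its bootstrap selection frequency.
B2 = 40
freq = {}
for _ in range(B2):
    idx = rng.integers(0, n, size=n)
    m = backward_aic(X[idx], y[idx], keep)
    freq[m] = freq.get(m, 0) + 1

# The MA prediction refits each selected model on the ORIGINAL data and
# averages the predictions with the selection frequencies as weights.
y_ma = np.zeros(n)
for m, f in freq.items():
    A = np.column_stack([np.ones(n)] + [X[:, c] for c in m])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    y_ma += (f / B2) * (A @ beta)
```

Note that screening shrinks the candidate-model space before any averaging takes place, which is exactly what makes the MA step feasible when the number of variables is large.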
In this example we compare results from bootstrapMA with those of the other approaches and also with BMA (Hoeting et al., 1999). In all methods we restrict ourselves to fitting linear terms only and to the in- or exclusion of variables. We introduce the model building approaches as well as the assessment of predictive ability in Section 2. In Section 3 we describe the design of our simulation study, and we present the results in Section 4. Section 5 gives an example for further illustration and comparison. In Section 6 we discuss the results and give some recommendations for the development of predictive models.
Conclusion
The problems of the popular approach of deriving a model in a data-dependent way while ignoring the uncertainty of the model building process and basing inference on the 'conditional' model have been known for a long time. Breiman (1992) calls this a 'quiet' scandal. The importance of incorporating model selection uncertainty into the process of model building must be stressed. Augustin et al. (2005) proposed a strategy which aims to use MA in problems with a large(r) (say >10) number of variables. This results in more than 1000 candidate models, often hundreds of thousands. The necessity of a screening step is obvious. In contrast to other screening steps in MA procedures (Hoeting et al., 1999; Burnham and Anderson, 2002; Yuan and Yang, 2005), we propose to screen for variables hardly supported by the data. More popular is screening based on models hardly supported by the data. The latter concept ignores possible future uses, as all original variables have to be measured even if their contribution to the predictor is negligible. Comparing results of bootstrapMA with BMA in the ozone data, we illustrate the advantages of our variable elimination approach. We present a simulation study comparing MA procedures with the full model and with several procedures selecting a single model, in the framework of realistic model building problems. The design M1 with sample size 400 can be seen as a very simple selection problem; the design M3 is already extreme for N=100 (the sample-size-per-predictor ratio is four, whereas general recommendations see 10 as a minimum; Harrell et al., 1984; Peduzzi et al., 1995); all other designs lie in between. We incorporated some degree of multicollinearity and consider different amounts of explained variation. Variable selection is simpler and results are more similar for a small residual variance σ², but the instability of model selection and the differences between the approaches increase with larger variances (Breiman, 1996b).
Since R² is between 0.53 and 0.87, results can be partly 'extrapolated' to larger variances with smaller explained variation. However, for low R² (say below 0.2) results are probably much worse; see for example a related investigation on selecting the correct model (Dell'Aquila and Ronchetti, 2006). For the simple situation (M1, 400, 2.5), results of all approaches are similar. Not much can be gained by eliminating unimportant variables (only three in the design), and important variables have a high probability of being included. MA is unnecessary, and practical issues may guide the preference between the full model and a single selected model. For M1 with a small sample size, selection of a single model with BIC is less sensible. AIC selects at a larger significance level, resulting in much more power to include weak effects. In all situations, the full model has an excellent coverage rate. For M1 the full model is a sensible approach; however, with a larger number of noise variables its MSE is large. For a large sample size, BIC-based approaches have smaller MSE than AIC-based approaches; for a smaller sample size AIC is preferable because of the smaller probability of excluding variables with an effect. Because of theoretical results and the problems introduced by variable selection, several researchers prefer the full model. However, variable selection is sensible with a larger number of candidate variables and a sufficient sample size. This is also illustrated in our example (see Section 5.3). Additionally, for continuous regressors it is important to consider whether non-linear effects are present. Non-linearity was not an important issue in the ozone example (data not shown), but in general this is also an argument against simply estimating the regression coefficients from the full model. Instead, simultaneous selection of important variables in combination with the selection of a reasonable functional relationship for continuous variables is required (Royston and Sauerbrei, 2005).
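The variance-underestimation phenomenon that motivates MA can be reproduced in a few lines. The Monte Carlo sketch below is our own construction, not one of the paper's simulation designs: it uses a crude t-statistic cut-off (not the paper's BE procedures) to decide whether a borderline variable enters the model, then compares the actual sampling variance of the resulting predictor with the average of the 'conditional' variance estimates computed as if the selected model had been fixed in advance.

```python
import numpy as np

rng = np.random.default_rng(3)

n, reps = 50, 400
x0 = np.array([1.0, 1.0])    # covariate point at which the mean is predicted
beta = np.array([0.5, 0.3])  # second effect is deliberately borderline

preds, est_vars = [], []
for _ in range(reps):
    X = rng.normal(size=(n, 2))
    y = X @ beta + rng.normal(size=n)

    # Crude data-dependent selection: drop x2 when its t-statistic is below 2.
    G = np.linalg.inv(X.T @ X)
    b = G @ X.T @ y
    s2 = np.sum((y - X @ b) ** 2) / (n - 2)
    t2 = b[1] / np.sqrt(s2 * G[1, 1])
    Xs, x0s = (X, x0) if abs(t2) >= 2.0 else (X[:, :1], x0[:1])

    # Refit the selected model and record its naive 'conditional' variance.
    Gs = np.linalg.inv(Xs.T @ Xs)
    bs = Gs @ Xs.T @ y
    s2s = np.sum((y - Xs @ bs) ** 2) / (n - Xs.shape[1])
    preds.append(float(x0s @ bs))
    est_vars.append(float(s2s * (x0s @ Gs @ x0s)))

emp_var = float(np.var(preds))      # actual sampling variance of the predictor
avg_est = float(np.mean(est_vars))  # average model-based variance estimate
```

Because the model-based estimate ignores which model got selected, `avg_est` falls short of `emp_var`; naive confidence intervals built from it would consequently undercover.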
Estimating the variance of the predictor from one selected model results in an underestimation. The simulation shows that the MA approach can correct for this. In all situations, MA gives coverage rates close to the 95% nominal rate. Using a preliminary screening step before the MA step has only a minor influence; it even corrects for the too large coverage rate of MA in some situations. The MSEs of MA-based approaches are often better, and never worse, than their counterparts from a single model. For situations with a larger number of variables, our simulation study gives empirical evidence of the superiority of MA approaches in statistical criteria such as MSE and coverage rate. The screening step was intended to improve the practical usefulness of MA approaches (Augustin et al., 2005). As it does not harm the statistical criteria, we consider the bootstrapMA approach with a variable screening step a sensible way to improve prediction in regression models. Besides the screening step, there is a close similarity between bootstrapMA and the bagging estimator proposed by Breiman (1996a). The principal difference is that bagging uses the data from the bootstrap replications instead of the original data for the estimates. Therefore, in formulas (2) and (3) the weights w_j have to be replaced by 1/B_2. Holländer et al. (2006) extended this estimator by implementing the screening step. In a specific example and in a small simulation study, results of bootstrapMA were slightly better than results for bagging; the latter approach was not considered here. Even with our variable screening step, MA approaches lack easy interpretation and generalisability. Most likely, this is an important reason for their low acceptance in practical analyses. Giving more weight to interpretability and practical aspects, we intend to consider even more rigorous variable screening, paying the price of worsening the statistical criteria.
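The contrast with bagging can be written out explicitly. Since formulas (2) and (3) are not reproduced in this excerpt, the notation below is ours: \(\hat{\mu}_j(x_0)\) denotes the prediction of model \(j\) refitted on the original data, \(\hat{\mu}^{(b)}(x_0)\) the prediction fitted on the \(b\)-th bootstrap sample, and \(w_j\) the bootstrap selection frequency of model \(j\).

```latex
% bootstrapMA: models refitted on the original data, weighted by their
% bootstrap selection frequencies w_j, with \sum_j w_j = 1
\hat{\mu}_{\mathrm{MA}}(x_0) \;=\; \sum_{j} w_j \,\hat{\mu}_j(x_0)

% bagging: every bootstrap fit enters with equal weight 1/B_2, and each
% estimate \hat{\mu}^{(b)} is computed from the b-th bootstrap sample itself
\hat{\mu}_{\mathrm{bag}}(x_0) \;=\; \frac{1}{B_2} \sum_{b=1}^{B_2} \hat{\mu}^{(b)}(x_0)
```

Replacing the selection-frequency weights \(w_j\) by the uniform weights \(1/B_2\), and the original-data fits by the bootstrap fits, turns the first estimator into the second.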
Such an approach may help to identify a small number (say up to five) of 'top' models which predict the outcome. Investigating the stability of simultaneously selecting variables and functional forms for continuous predictors from a class of 5760 candidate models, Royston and Sauerbrei (2003) identified, through a detailed summary of 5000 bootstrap replications, three 'basic' prognostic 'top' models for breast cancer patients. An alternative to bootstrapMA is BMA. However, defining suitable priors for the different models is a difficult task, especially with a large set of possible models. The simplest suggested solution is to use non-informative priors, which is problematic in the case of correlated explanatory variables. Another problem of the BMA approach is that large summations and integrals are necessary if the model space is large. The latter can be improved by transferring the variable screening concept to BMA. We would expect similar results for the Bayesian and bootstrap MA approaches if the correlation between explanatory variables is not very high. Then the preference for bootstrapMA or BMA may depend more on philosophical considerations than on differences between the predictors. For non-informative priors the similarity is obvious (Efron, 2005).