Combining non-parametric models with logistic regression: an application to motor vehicle injury data
Publisher: Elsevier - Science Direct
Journal: Computational Statistics & Data Analysis, Volume 34, Issue 3, 28 September 2000, Pages 371–386
To date, computer-intensive non-parametric modelling procedures such as classification and regression trees (CART) and multivariate adaptive regression splines (MARS) have rarely been used in the analysis of epidemiological studies. Most published studies rely on techniques such as logistic regression, which summarise results simply in the form of odds ratios. However, flexible non-parametric techniques such as CART and MARS can provide more informative and attractive models whose individual components can be displayed graphically. An application of these sophisticated techniques to the analysis of an epidemiological case-control study of injuries resulting from motor vehicle accidents has been encouraging. The techniques not only identified potential areas of risk, largely governed by age and number of years of driving experience, but can also identify outlier groups and can be used as a precursor to a more detailed logistic regression analysis.
A common problem in most practical research is to assess relationships among a set of variables. A primary statistical tool is regression analysis, which may be used to evaluate the relationship of one or more covariates or predictor variables x1,…,xn to a single (continuous or binary/ordinal) response variable y. It is most often used when the predictor variables cannot be controlled, as when they are collected in a sample survey or other observational study. However, regression analysis may also be applied to controlled experimental situations. The most common situations where regression analysis is appropriate include problems where one wishes to capture the joint predictive relationship of y on a small subset of x1,…,xn in the form y = f(x1,…,xn) + ε, where ε is an additive stochastic component with zero expectation. Existing methods of regression analysis range from simple linear and polynomial regression to logistic regression (Cox, 1970; Hosmer and Lemeshow, 1989), generalised additive modelling (Hastie and Tibshirani, 1990) and, more recently, methods such as classification and regression trees (CART) (Breiman et al., 1984) and multivariate adaptive regression splines (MARS) (Friedman, 1991), a sophisticated, flexible regression technique based on recursive partitioning strategies.
Examples of MARS models being used in practice are limited, mainly because of the computational requirements of fitting a MARS model and the complexity of the resulting fit. One area that has yielded some applications of MARS is medicine (Friedman and Roosen, 1995; Gill et al., 1996). Other fields taking an interest in the MARS methodology include data mining (Stone et al., 1997), where problems of dealing with large datasets arise, and spatial problems such as those in oceanography, where modelling sea ice distribution in the Southern Ocean was of primary interest (De Veaux et al., 1993a).
Other papers have focussed on comparisons between MARS and other non-linear regression methods (Frank, 1995; Marshall et al., 1994) and neural networks (De Veaux et al., 1993b; Ripley, 1994; Cas and Stone, 1996). MARS has even been applied to time series problems, providing improvements over existing techniques (Lewis and Stevens, 1991). A comprehensive tutorial on applying MARS to a calibration problem in chemistry (Sekulic and Kowalski, 1992) is also worth reading as a gentle introduction to the technique.
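For readers unfamiliar with MARS, the short Python sketch below illustrates the kind of piecewise-linear building block on which the method is based (Friedman, 1991): paired hinge functions of the form max(0, x − t) and max(0, t − x) placed at a knot t. The knot at age 25, the coefficients and all function names are hypothetical values chosen purely for illustration. They are not taken from the study, and a real MARS fit would select knots and coefficients from the data.

```python
def hinge_right(x, knot):
    """Right hinge max(0, x - knot): zero up to the knot, linear after it."""
    return max(0.0, x - knot)


def hinge_left(x, knot):
    """Left hinge max(0, knot - x): linear before the knot, zero after it."""
    return max(0.0, knot - x)


def toy_mars_fit(age):
    """Evaluate a hand-written, MARS-style additive model of one predictor.

    Every number here is an assumption made for illustration: an intercept
    plus one pair of hinge functions at an assumed knot of 25 years.
    """
    knot = 25.0          # assumed knot, not estimated from any data
    intercept = -1.0     # assumed baseline value of the response
    coef_below = 0.25    # assumed weight on the left hinge (ages under 25)
    coef_above = 0.01    # assumed weight on the right hinge (ages over 25)
    return (intercept
            + coef_below * hinge_left(age, knot)
            + coef_above * hinge_right(age, knot))


if __name__ == "__main__":
    # Print the fitted value at a few ages on either side of the knot.
    for age in (17, 25, 45):
        print(age, toy_mars_fit(age))
```

Because each hinge is zero on one side of its knot, the fitted function can change slope at the knot while remaining continuous, which is what allows the individual components of such a model to be plotted and inspected one predictor at a time.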