ISI article No. 24166

Title
PLS generalised linear regression

Article code: 24166
Publication year: 2005
Length of English article: 30 pages (PDF)
Source

Publisher: Elsevier - Science Direct

Journal: Computational Statistics & Data Analysis, Volume 48, Issue 1, 1 January 2005, Pages 17–46

Keywords
Partial least squares, Stepwise regression, Variable selection, Modified PLS regression

Abstract

PLS univariate regression is a model linking a dependent variable y to a set X={x1,…,xp} of (numerical or categorical) explanatory variables. It can be obtained as a series of simple and multiple regressions. By taking advantage of the statistical tests associated with linear regression, it is feasible to select the significant explanatory variables to include in PLS regression and to choose the number of PLS components to retain. The principle of the presented algorithm may similarly be used to extend PLS regression to PLS generalised linear regression. The modifications to classical PLS regression, the case of PLS logistic regression and the application of PLS generalised linear regression to survival data are studied in detail. Some examples show the use of the proposed methods in real practice. In fact, classical PLS univariate regression is the result of an iterated use of ordinary least squares (OLS), where PLS stands for partial least squares. PLS generalised linear regression retains the rationale of PLS, while the criterion optimised at each step is based on maximum likelihood. Nevertheless, the acronym PLS is kept as a reference to a general methodology for relating a response variable to a set of predictors. The approach proposed for PLS generalised linear regression is simple and easy to implement. Moreover, it can easily be generalised to any model that is linear at the level of the explanatory variables.
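To make the "series of simple and multiple regressions" reading concrete, the following is a minimal NumPy sketch of classical univariate PLS. It is an illustration under stated assumptions, not the authors' code: the function names (`standardize`, `simple_slopes`, `pls1`) and the synthetic data are hypothetical, the predictors are assumed numerical and complete, and X is standardised so that the first weight vector is proportional to the slopes of the p simple regressions of y on each x_j.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit sample variance."""
    Xc = X - X.mean(axis=0)
    return Xc / Xc.std(axis=0, ddof=1)

def simple_slopes(X, y):
    """Slope of the simple regression of y on each column of X.
    Assumes the columns of X and y are already centered."""
    return (X.T @ y) / np.sum(X**2, axis=0)

def pls1(X, y, n_components):
    """Classical univariate PLS: covariance weights, deflation, final OLS."""
    Xh = standardize(X)          # residual predictor matrix, updated per component
    yc = y - y.mean()
    T, W = [], []
    for _ in range(n_components):
        a = Xh.T @ yc            # proportional to cov(y, x_j) for each residual column
        w = a / np.linalg.norm(a)
        t = Xh @ w               # PLS component (w has unit norm)
        # Deflate: keep the residual of each column after regressing it on t
        Xh = Xh - np.outer(t, (Xh.T @ t) / (t @ t))
        T.append(t)
        W.append(w)
    T = np.column_stack(T)
    # Multiple regression of the centered response on the retained components
    c, *_ = np.linalg.lstsq(T, yc, rcond=None)
    return T, np.column_stack(W), c

# Hypothetical synthetic data, for illustration only
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.3, size=50)

T, W, c = pls1(X, y, n_components=2)
slopes = simple_slopes(standardize(X), y - y.mean())
# First weight vector versus normalised simple-regression slopes
print(np.allclose(W[:, 0], slopes / np.linalg.norm(slopes)))
# Components are orthogonal when there are no missing data
print(abs(T[:, 0] @ T[:, 1]) < 1e-8)
```

In this sketch the significance tests mentioned in the abstract are not implemented; they would be applied to the individual regression coefficients before the weights are normalised.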

Conclusion

The PLS regression algorithm as re-formulated above has several advantages:

(1) Classical PLS regression is directly linked to the usual procedures for simple and multiple regression and is therefore enriched by the classical testing procedures of those methods. In this framework, the statistical tests aim at identifying the explanatory variables that do not contribute significantly to the construction of the PLS components and that consequently have low explanatory power for the response variable. A PLS component is judged not significant when no explanatory variable has a significant weight in its construction. In the example of Section 2, the proposed approach led to the same variable selection as backward stepwise classical PLS regression. The approach might be further validated on a wider variety of examples, and compared with other approaches to variable selection in PLS recently proposed in Forina et al. (1999), Gauchi and Chagnon (2001), Höskuldsson (2001), Lazraq et al. (2003), Lindgren et al. (1994) and Sarabia et al. (2001).

(2) In practice, when strong multicollinearity is present, stepwise multiple regression is commonly used. The drawback of this method is that it can eliminate explanatory variables that are strongly correlated with the response variable and thus important for the user. By contrast, PLS regression allows all variables with stronger explanatory power to be retained in the model.

(3) With missing data, PLS components are computed with the NIPALS algorithm. However, the PLS components are correlated in this case. The original PLS regression algorithm does not take this into account. By contrast, the new formulation lets the correlation between PLS components play a role, since multiple regression is used.

(4) There is an immediate extension to generalised linear regression. Some preliminary results have already been obtained in PLS logistic regression (Esposito Vinzi and Tenenhaus, 2001) and for survival data with the Cox PLS model (Bastien and Tenenhaus, 2001).
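The extension to generalised linear regression noted in point (4) can be sketched for the logistic case. The code below is a hedged illustration, not the published algorithm: it assumes a binary response, complete numerical predictors, and a hand-written Newton-Raphson logistic fit. The names `logit_fit` and `pls_logistic` and the synthetic data are hypothetical. Each weight is taken as the coefficient of one (residual) predictor in a logistic regression of y on the previous components plus that predictor, which replaces the OLS step of classical PLS with a maximum-likelihood step.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit sample variance."""
    Xc = X - X.mean(axis=0)
    return Xc / Xc.std(axis=0, ddof=1)

def logit_fit(Z, y, n_iter=25, ridge=1e-8):
    """Logistic regression by Newton-Raphson; returns [intercept, coefficients...]."""
    A = np.column_stack([np.ones(len(y)), Z])
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        eta = np.clip(A @ beta, -30, 30)      # guard against overflow in exp
        p = 1.0 / (1.0 + np.exp(-eta))
        grad = A.T @ (y - p)
        # Small diagonal term keeps the Hessian invertible
        H = (A * (p * (1 - p))[:, None]).T @ A + ridge * np.eye(A.shape[1])
        step = np.linalg.solve(H, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

def pls_logistic(X, y, n_components):
    """PLS-type components whose weights come from logistic regressions."""
    Xh = standardize(X)                        # residual predictors
    n, p = Xh.shape
    T = np.empty((n, 0))
    for _ in range(n_components):
        # Coefficient of residual column j, given the components found so far
        a = np.array([logit_fit(np.column_stack([T, Xh[:, j]]), y)[-1]
                      for j in range(p)])
        w = a / np.linalg.norm(a)
        t = Xh @ w
        T = np.column_stack([T, t])
        # Deflate the predictors by linear regression on the new component
        Xh = Xh - np.outer(t, (Xh.T @ t) / (t @ t))
    beta = logit_fit(T, y)                     # final logistic model on the components
    return T, beta

# Hypothetical synthetic data, for illustration only
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
p_true = 1.0 / (1.0 + np.exp(-(X @ np.array([1.5, -1.0, 0.5, 0.0]))))
y = (rng.random(200) < p_true).astype(float)

T, beta = pls_logistic(X, y, n_components=2)
probs = 1.0 / (1.0 + np.exp(-(beta[0] + T @ beta[1:])))
print(T.shape, beta.shape)
```

A significance test on each coefficient (for example a Wald test) could be applied before normalising the weights, in the spirit of the variable selection described in point (1); that step is left out here to keep the sketch short.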