English ISI article No. 24121

# Resampling for checking linear regression models via non-parametric regression estimation
Article code: 24121
Year of publication: 2000
Length of English article: 21 pages (PDF)
Source

Publisher: Elsevier - Science Direct

Journal: Computational Statistics & Data Analysis, Volume 35, Issue 2, 28 December 2000, Pages 211–231

Keywords
Hypothesis testing, Non-parametric estimators, Regression models, Time series, Bootstrap
Conclusions
In a general context of dependent data, we have examined two bootstrap tests to check whether the regression function follows a general linear model. The results of this study show that, for small or moderate sample sizes, the bootstrap tests behave better than the test obtained from the asymptotic distribution of the functional distance defined in (5). Both bootstrap tests behave similarly, although the WB test is slightly more conservative than the NB test. The two bootstrap tests also have similar computational cost, with the WB test being slightly faster. A drawback of the proposed bootstrap tests is that computing the NB test requires one smoothing parameter and computing the WB test requires two. Methods for selecting these parameters deserve further study. Taking the arguments above into account, we recommend using the Naive Bootstrap (NB) test to check the linearity of an unknown regression function for moderate sample sizes under the homoskedasticity assumption. In computing this test we can use, as smoothing parameter, the value $h_0$ at which the function $D(h)$ presents a local minimum. If the sample size is very large, the test based on the asymptotic distribution could provide good results. In any case, it is very important to take the existence of correlation into account when applying the tests.
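The bandwidth rule mentioned above (take the value at which $D(h)$ has a local minimum) can be sketched as follows. This is only an illustration: it assumes that $D(h)$ has already been evaluated on an increasing grid of bandwidths, the function name `first_local_minimum` is hypothetical, and returning the smallest such bandwidth is a choice made here, since the text does not say which local minimum to prefer.

```python
from typing import Optional, Sequence


def first_local_minimum(h_grid: Sequence[float],
                        d_values: Sequence[float]) -> Optional[float]:
    """Return the smallest grid bandwidth at which D(h) has an interior local minimum.

    h_grid is assumed to be sorted in increasing order and d_values[i] is
    assumed to hold D(h_grid[i]). Returns None if D(h) has no interior
    local minimum on this grid.
    """
    if len(h_grid) != len(d_values):
        raise ValueError("h_grid and d_values must have the same length")
    for i in range(1, len(h_grid) - 1):
        # Strictly smaller than both neighbours: interior local minimum.
        if d_values[i] < d_values[i - 1] and d_values[i] < d_values[i + 1]:
            return h_grid[i]
    return None
```

If the function returns `None`, a finer or wider grid may be needed; the sketch does not attempt to decide that automatically.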
Article preview

#### Abstract

Let us consider the fixed regression model $Y_t = m(x_t) + \varepsilon_t$, and assume that the random errors, $\{\varepsilon_t\}$, follow an ARMA-type dependence structure. The purpose of this paper is to study the application of the bootstrap test to check that the unknown regression function, $m$, follows a general linear model of the type $m_\theta(\cdot) = A^{\top}(\cdot)\theta$, with $A$ being a vector-valued functional of the design variable. In a previous paper, González-Manteiga and Vilar-Fernández (1995) proposed a test statistic, $D$, based on a Cramér–von Mises-type functional distance between $\hat{m}_n$, a Gasser–Müller-type non-parametric estimator of $m$, and the member of the linear family that is closest to $\hat{m}_n$. In this work, two bootstrap algorithms are considered, where the dependence structure of the errors is taken into account. A broad simulation study and an applied example show the good behavior of the bootstrap test.

#### Introduction

Let us consider the regression model $Y_t = m(x_t) + \varepsilon_t$, $t = 1, \dots, n$ (1), where $x_t \in C$, with $C$ a compact set in $\mathbb{R}$, $m$ the unknown regression function, and $\{\varepsilon_t\}_{t=1}^{n}$ a sequence of unobserved zero-mean random variables. In the last few years, several tests have been developed for the null hypothesis (2), that $m$ belongs to a given parametric family, versus a general alternative hypothesis of the type $H_1$: “$m$ is a function with a certain degree of smoothness”. Given an initial sample $\{(x_t, Y_t)\}_{t=1}^{n}$, almost all of these tests are based on a distance between a non-parametric pilot estimator of $m$ and a parametric estimator of $m$ under $H_0$. If this discrepancy is statistically significant, $H_0$ is rejected; otherwise, it is not rejected. Among the interesting recent papers that address this problem are those by Firth et al. (1991), Kozek (1991), Eubank and Hart (1993), Eubank et al. (1993), Härdle and Mammen (1993) and Stute and González-Manteiga (1996), where different non-parametric pilot estimators are used (kernel, spline, etc.).

In this work, we devote attention to goodness of fit for linear regression models. That is, for models of type (1) we wish to test the hypothesis (3) that $m$ has the linear form $m_\theta(\cdot) = A^{\top}(\cdot)\theta$, against the alternative given in (2), where $A$ is a vector-valued functional of the design variable. The study is carried out taking into account that the errors $\varepsilon_t$ are dependent. This often happens when analyzing economic data, growth curves and, in general, whenever the observations are gathered sequentially in time.

It is important to take the correlation among the errors into account when the model is statistically analyzed. Ignoring it causes inefficiency in the parametric estimation of the model (Seber and Wild, 1989) and in the non-parametric estimation (Chu and Marron, 1991), and it also affects the power of the goodness-of-fit test used, as we show later in the simulation study.
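To illustrate why the correlation of the errors cannot be ignored, the following sketch compares the exact variance of the sample mean of $n$ correlated errors with the value given by the usual independent-errors formula. It assumes stationary AR(1) errors, which is only one special case of the ARMA-type structure considered in the paper; the function name and its parameters are hypothetical.

```python
import numpy as np


def variance_of_mean_ar1(n, rho, sigma_a=1.0):
    """Variance of the mean of n stationary AR(1) errors.

    Returns a pair (exact, iid_formula): the exact variance computed from
    the autocovariances gamma(k) = gamma(0) * rho**|k|, and the value
    gamma(0) / n that would be used if the errors were treated as
    independent. sigma_a is the innovation standard deviation.
    """
    gamma0 = sigma_a**2 / (1.0 - rho**2)
    # Matrix of absolute lags |i - j| between all pairs of observations.
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    cov = gamma0 * rho**lags
    return float(cov.sum() / n**2), float(gamma0 / n)
```

With `n = 100` and `rho = 0.5` the exact variance is close to three times the independent-errors value, so any procedure calibrated under independence would understate the variability of smoothed or averaged errors.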
The dependence structure of the errors in the goodness-of-fit problem was considered for the first time by González-Manteiga and Vilar-Fernández (1995). In their work, a test was considered that is based on a discrepancy between a non-parametric estimator of $m$ of the Gasser and Müller (1979) type, given in (4), and a parametric one. Here $K$ denotes, as usual, a kernel function and $h > 0$ the smoothing parameter. The parametric estimator minimizes the functional (5), where $\omega$ is a weight function that prevents boundary effects of the kernel estimator, and the design points enter through their empirical distribution function. A natural discrepancy measure between the null hypothesis and the alternative is then the statistic $D$ defined in (6), where $\Psi$ is the Cramér–von Mises-type functional distance defined in (5).

In Theorem 1 of González-Manteiga and Vilar-Fernández (1995), the asymptotic normality of the parameter estimator and of $D$ is obtained under the hypothesis that the errors follow an MA($\infty$) dependence structure. They deduce the limit distribution (7), whose constants involve $\gamma(k)$, the autocovariance function of order $k$ of the error process $\{\varepsilon_i\}$, and where $*$ denotes the convolution operator. Using the limit distribution (7), hypothesis $H_0$ is rejected at significance level $\alpha$ when condition (8) holds, $z_\alpha$ being such that $\Phi(z_\alpha) = 1 - \alpha$, with $\Phi$ the distribution function of the standard normal. In practice, expression (8) cannot be used as it stands, because it depends on two unknown parameters, $\sigma_\varepsilon^2$ and $\sigma_D^2$, which have to be estimated from the sample. A “plug-in” version of the test can be obtained by estimating these parameters. In both cases, the convergence to the normal distribution in (7) is very slow: for the usual values of the smoothing parameter, $h \approx n^{-1/5}$, the convergence speed obtained is of order $n^{-1/10}$ (see Härdle and Mammen (1993) and González-Manteiga and Vilar-Fernández (1995) for more details on this aspect). The “plug-in” test therefore requires very large sample sizes.
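The construction described above (a Gasser–Müller estimate, a parametric fit that minimizes a weighted distance to it, and the minimized distance as test statistic) can be sketched in code. This is a minimal sketch under stated assumptions, not the authors' implementation: the design is assumed sorted in $[0, 1]$, the kernel is Epanechnikov, the weight function is an indicator of $[h, 1 - h]$, and the distance is read as the weighted squared difference averaged over the design points, since equations (4) to (6) are not reproduced in this excerpt. All function names are hypothetical.

```python
import numpy as np


def epan_cdf(v):
    """CDF of the Epanechnikov kernel K(v) = 0.75 * (1 - v**2) on [-1, 1]."""
    v = np.clip(v, -1.0, 1.0)
    return 0.5 + 0.75 * v - 0.25 * v**3


def gm_weights(x_eval, x, h, lo=0.0, hi=1.0):
    """Gasser-Mueller weights: integrals of K_h(x_eval - u) over [s_{t-1}, s_t]."""
    # Interval endpoints: midpoints of the sorted design, closed by the bounds.
    s = np.concatenate(([lo], (x[:-1] + x[1:]) / 2.0, [hi]))
    upper = epan_cdf((x_eval[:, None] - s[None, :-1]) / h)
    lower = epan_cdf((x_eval[:, None] - s[None, 1:]) / h)
    return upper - lower  # shape (len(x_eval), len(x))


def distance_statistic(x, y, h, design):
    """Return (D, theta_hat) for the linear family design(x) @ theta."""
    m_hat = gm_weights(x, x, h) @ y            # non-parametric estimate at the design
    A = design(x)                              # (n, q) matrix of regressors
    w = ((x >= h) & (x <= 1.0 - h)).astype(float)  # indicator weight, zero near the edges
    sw = np.sqrt(w)
    # Weighted least squares: theta minimizing mean(w * (m_hat - A @ theta)**2).
    theta, *_ = np.linalg.lstsq(A * sw[:, None], m_hat * sw, rcond=None)
    d = float(np.mean(w * (m_hat - A @ theta) ** 2))
    return d, theta
```

For a straight-line null model one could pass `design = lambda z: np.column_stack([np.ones_like(z), z])`. On noise-free linear data the statistic is essentially zero, while curvature in the data makes it clearly positive.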
To address this problem, we study two bootstrap algorithms that approximate the distribution of $D$ as an alternative to the normal approximation. In Section 2, we describe the two resampling algorithms under an ARMA-type dependence in the errors. In Section 3, we present a broad simulation study that compares the tests with a normal critical region (with known theoretical parameters or with estimated parameters) with the tests obtained from the two bootstrap algorithms. In this study, several aspects of interest are considered, such as the selection of the smoothing parameters, the influence of the structure of the errors, the sample size and the parameters of the model. In Section 4, the proposed tests are applied to a numerical example. Finally, we present the conclusions of our study in Section 5.
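The two algorithms of Section 2 are not reproduced in this excerpt, so the sketch below is not the NB or WB procedure of the paper. It shows a generic residual-based bootstrap for the same kind of statistic, assuming AR(1) errors, an Epanechnikov kernel, a sorted design in $[0, 1]$ and an indicator weight on $[h, 1 - h]$. All names are hypothetical, and choices such as fitting the AR(1) coefficient from interior non-parametric residuals are assumptions made only for illustration.

```python
import numpy as np


def epan_cdf(v):
    """CDF of the Epanechnikov kernel on [-1, 1]."""
    v = np.clip(v, -1.0, 1.0)
    return 0.5 + 0.75 * v - 0.25 * v**3


def gm_smoother(x, h):
    """n x n Gasser-Mueller smoothing matrix for a sorted design in [0, 1]."""
    s = np.concatenate(([0.0], (x[:-1] + x[1:]) / 2.0, [1.0]))
    return (epan_cdf((x[:, None] - s[None, :-1]) / h)
            - epan_cdf((x[:, None] - s[None, 1:]) / h))


def distance(W, y, A, w):
    """Minimized weighted squared distance between W @ y and the linear family."""
    m_hat = W @ y
    sw = np.sqrt(w)
    theta, *_ = np.linalg.lstsq(A * sw[:, None], m_hat * sw, rcond=None)
    return float(np.mean(w * (m_hat - A @ theta) ** 2))


def ar1_bootstrap_pvalue(x, y, A, h, n_boot=200, burn=50, seed=0):
    """Bootstrap p-value for linearity; returns (p_value, d_obs, rho_hat)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    W = gm_smoother(x, h)
    w = ((x >= h) & (x <= 1.0 - h)).astype(float)
    d_obs = distance(W, y, A, w)
    # Null fit (ordinary least squares) used to build bootstrap responses.
    theta0, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted0 = A @ theta0
    # Interior non-parametric residuals, centred, and a crude AR(1) fit.
    e = (y - W @ y)[w > 0]
    e = e - e.mean()
    rho = float(np.clip(e[1:] @ e[:-1] / (e[:-1] @ e[:-1]), -0.99, 0.99))
    innov = e[1:] - rho * e[:-1]
    innov = innov - innov.mean()
    d_star = np.empty(n_boot)
    for b in range(n_boot):
        a = rng.choice(innov, size=n + burn, replace=True)
        err = np.empty(n + burn)
        prev = 0.0
        for t in range(n + burn):
            prev = rho * prev + a[t]  # rebuild AR(1) errors from resampled innovations
            err[t] = prev
        # Discard the burn-in and recompute the statistic under the null model.
        d_star[b] = distance(W, fitted0 + err[burn:], A, w)
    return float(np.mean(d_star >= d_obs)), d_obs, rho
```

The null hypothesis would be rejected at level $\alpha$ when the returned p-value falls below $\alpha$. Resampling innovations instead of the residuals themselves is what lets the bootstrap errors keep the estimated dependence structure.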