Let $Y$ be a response variable and $Z=(Z_1,Z_2,\ldots,Z_r)'$ be a random vector of covariates. Assume that $Y$ and $Z$ follow a linear regression model
$$Y = b'Z + \varepsilon, \tag{1}$$
where $b=(b_1,b_2,\ldots,b_r)'$ is the vector of parameters and $\varepsilon$, uncorrelated with $Z$, is the error term with mean zero and variance $\sigma_\varepsilon^2$. When the response $Y$ is subject to random right censoring, we only observe $(U_i,Z_i,\delta_i)$, $i=1,2,\ldots,n$, which are $n$ replications of $(U,Z,\delta)$, where $U=\min(Y,C)$, $\delta=I[Y\leqslant C]$ and $C$ is the censoring random variable, which is independent of the response $Y$. Note that the censoring works both ways: if $\delta=0$, the response $Y$ is censored by the censoring variable $C$; if $\delta=1$, on the other hand, the censoring variable $C$ is censored by the response $Y$. The censored regression problem is to estimate the parameter vector $b$ and to investigate the statistical properties of the estimators based on the observations $(U_i,Z_i,\delta_i)$, $i=1,2,\ldots,n$.
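To make the observation scheme concrete, the following minimal sketch draws censored triples $(U, Z, \delta)$ from model (1). The distributional choices (a uniform covariate, a normal error with standard deviation 0.5, an exponential censoring variable) and the name `simulate_censored` are illustrative assumptions, not settings taken from this paper.

```python
import random


def simulate_censored(n, b, censor_mean, seed=0):
    """Draw n triples (U, Z, delta) from Y = b'Z + eps under right censoring.

    All distributions below are assumptions chosen for illustration only.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = [1.0, rng.random()]                   # Z = (1, Z_2)': constant plus one covariate
        eps = rng.gauss(0.0, 0.5)                 # mean-zero error, drawn independently of Z
        y = sum(bj * zj for bj, zj in zip(b, z)) + eps
        c = rng.expovariate(1.0 / censor_mean)    # censoring variable C, independent of Y
        u = min(y, c)                             # U = min(Y, C)
        delta = 1 if y <= c else 0                # delta = I[Y <= C]
        data.append((u, z, delta))
    return data


sample = simulate_censored(n=1000, b=[1.0, 2.0], censor_mean=5.0)
censored_share = 1 - sum(d for _, _, d in sample) / len(sample)
print(f"share of censored responses: {censored_share:.2f}")
```

Only `u`, `z` and `delta` are returned, mirroring the fact that neither $Y$ nor $C$ is observed directly.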
Many techniques have been proposed for handling the above regression problem. One methodology is based on synthetic data and uses the ordinary least squares procedure to obtain an estimator of $b$. It includes Miller's estimator (Miller, 1976), Buckley and James' estimator (Buckley and James, 1979; James and Smith, 1984; Jin et al., 2006), Leurgans' estimator (Leurgans, 1987; Zhou, 1992a) and what is called in this paper the KSV estimator (Koul et al., 1981; Srinivasan and Zhou, 1994; Fygenson and Zhou, 1992, 1994; Lai et al., 1995). Another estimation method is the weighted least squares estimate (abbreviated as WLS in this paper) (Stute, 1993, 1996; He and Wong, 2003). Within the former methodology, the KSV approach is the easiest to carry out because no iterations are required and standard least squares computer routines can be used once the observed responses have been transformed using the censoring information. The WLS method shares similar advantages. Although much work has been devoted to the statistical properties of these two methods, such as consistency and asymptotic normality (see the above references), it is natural to compare their finite-sample performance. This paper serves this purpose by means of an extensive simulation study and an analysis of the Stanford heart transplant data.
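As a rough illustration of why a synthetic-data approach needs no iteration, the sketch below applies the transformation commonly attributed to Koul et al. (1981), $Y^* = \delta U / (1 - G(U))$ with $G$ the distribution function of $C$, and then runs ordinary least squares on $Y^*$. It assumes, purely for simplicity, that $G$ is known and exponential; in practice $G$ is unknown and is typically replaced by an estimate such as the Kaplan-Meier estimator of the censoring distribution. The function name `ksv_ols` and all simulation settings are hypothetical.

```python
import math
import random


def ksv_ols(n=20000, b0=1.0, b1=2.0, censor_mean=20.0, seed=1):
    """OLS on synthetic responses with a KNOWN censoring distribution (assumption)."""
    rng = random.Random(seed)
    zs, ystar = [], []
    for _ in range(n):
        z = rng.random()                          # single covariate; intercept handled below
        y = b0 + b1 * z + rng.gauss(0.0, 0.5)     # latent response from the linear model
        c = rng.expovariate(1.0 / censor_mean)    # censoring variable, independent of y
        u = min(y, c)
        delta = 1 if y <= c else 0
        # P(C >= u) for an exponential C; equals 1 whenever u is negative
        surv_c = math.exp(-max(u, 0.0) / censor_mean)
        ystar.append(delta * u / surv_c)          # synthetic response Y*
        zs.append(z)

    # Closed-form simple linear regression of Y* on z (no iterations needed)
    zbar = sum(zs) / n
    ybar = sum(ystar) / n
    sxy = sum((z - zbar) * (y - ybar) for z, y in zip(zs, ystar))
    sxx = sum((z - zbar) ** 2 for z in zs)
    slope = sxy / sxx
    intercept = ybar - slope * zbar
    return intercept, slope


intercept, slope = ksv_ols()
print(f"intercept estimate: {intercept:.3f}, slope estimate: {slope:.3f}")
```

The censoring here is deliberately light (mean censoring time far above typical responses), which keeps the inverse weights close to one; heavier censoring inflates the variance of $Y^*$ and is exactly the kind of setting where finite-sample comparisons become informative.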
The remainder of this paper is organized as follows. In Section 2, the KSV and WLS procedures for linear regression models with censored data are briefly described, and some theoretical properties of the estimators, such as consistency and asymptotic normality, are summarized. Extensive simulations are conducted in Section 3 to compare the finite-sample performance of the two estimators under various censoring patterns and underlying distributions of the covariates and error term. In Section 4, the well-known Stanford heart transplant data are analyzed by both methods and the results are compared with those obtained by Miller and Halpern (1982). The paper concludes with a brief summary and discussion.