The traditional least squares estimators used in the multiple linear regression model are very sensitive to design anomalies. To rectify the situation we propose a reparametrization of the model. We derive modified maximum likelihood estimators and show that they are robust and considerably more efficient than the least squares estimators, besides being insensitive to moderate design anomalies.
The motivation for this paper comes from Puthenpura and Sinha (1986), who give examples in the context of off-line identification. They show that the commonly used least squares estimators (LSE) of the parameters in a multiple linear regression model are not efficient if the data are very noisy. They also show that the maximum likelihood equations are very sensitive to gross errors (outliers) and have convergence problems. As pointed out by Puthenpura and Sinha (1986, p. 231), outliers in data can occur due to large disturbances, data transmission errors, failures in transducers and A/D converters, etc. They work out modified maximum likelihood estimators (MMLE) from data replicated at each design point and show that the estimators are robust to outliers, a very desirable property. See also Åström (1980), Sayed, Nascimento, and Cipparrone (2002) and Subramanian and Sayed (2004), who reflect on the importance of robustness. In working out their estimators, Puthenpura and Sinha censor a certain proportion of extreme observations as in Tiku (1978). However, replications at each design point are often not available. Also, the estimators of $\sigma$ based on censored samples can have substantial downward bias (Tiku, 1980). We work out MMLE from complete samples when the noise is non-Gaussian and only one observation is available at each design point, as in most situations. We show that our estimators are efficient and robust to outliers and other data anomalies.
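As a minimal numerical illustration of this sensitivity (the design, sample size, and contamination scheme below are our own illustrative assumptions, not taken from Puthenpura and Sinha), the following Python sketch fits a straight line by least squares before and after a single gross error is injected into the response:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

# Hypothetical design and true parameters, chosen only for illustration.
n = 30
x = np.linspace(0.0, 10.0, n)
theta0, theta1 = 2.0, 1.5
y = theta0 + theta1 * x + rng.normal(scale=1.0, size=n)

X = np.column_stack([np.ones(n), x])  # design matrix with intercept column

def lse(X, y):
    """Ordinary least squares via the normal equations."""
    return np.linalg.solve(X.T @ X, X.T @ y)

clean = lse(X, y)

# Inject one gross error (outlier) at the last design point, of the kind
# caused by, e.g., a data-transmission error, and refit.
y_out = y.copy()
y_out[-1] += 25.0
contaminated = lse(X, y_out)

print("LSE on clean data:       ", clean)
print("LSE with one gross error:", contaminated)
```

A single aberrant observation shifts both fitted coefficients appreciably; it is precisely this behaviour that robust estimators such as the MMLE are designed to dampen.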
The methodology of modified maximum likelihood estimation originated with Tiku (1967, 1989) and Tiku and Suresh (1992) and has been used extensively (Puthenpura and Sinha, 1986; Schneider, 1986; Tiku and Akkaya, 2004; Tan and Tabatabai, 1988; Tiku et al., 1986; Vaughan, 2002). Another difficulty with the LSE is that their variances are profoundly influenced by the design values $x_{ij}$ ($1 \leqslant i \leqslant n$, $1 \leqslant j \leqslant q$), and the design values are not always predetermined. If there are outliers (Tiku, 1977) in the design values, the diagonal elements of $(X'X)^{-1}$ will be small and the LSE of the regression coefficients $\hat{\theta}_j$ ($1 \leqslant j \leqslant q$) will appear to be efficient. If there are inliers (Akkaya & Tiku, 2005) in the design values, the diagonal elements of $(X'X)^{-1}$ will be large and the LSE will appear to be inefficient. This is easiest to see when $q = 1$, as the worked expression below shows. To rectify the situation, we suggest a reparametrization of the model. We show that the LSE for this model are insensitive to design anomalies. We derive the MMLE and show that they are robust (in particular to outliers) and considerably more efficient than the LSE, besides being insensitive to design anomalies.
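To make the $q = 1$ mechanism concrete, consider the simple model $y_i = \theta_0 + \theta_1 x_i + e_i$ with $\mathrm{Var}(e_i) = \sigma^2$. The variance of the least squares slope estimator is the standard expression
$$\mathrm{Var}(\hat{\theta}_1) = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2},$$
so an outlier among the $x_i$ inflates $\sum_{i=1}^{n} (x_i - \bar{x})^2$ and makes $\hat{\theta}_1$ look efficient, while inliers (design values clustered near $\bar{x}$) deflate the sum and make it look inefficient, even though $\sigma^2$ and the noise distribution are unchanged.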