Selection between Weibull and lognormal distributions: A comparative simulation study
Publisher: Elsevier - ScienceDirect
Journal: Computational Statistics & Data Analysis, Volume 53, Issue 2, 15 December 2008, Pages 477-485
Selecting the correct distribution for a given set of data is an important issue, especially when the tail probabilities are of interest, as in lifetime data analysis. The Weibull and lognormal distributions are the ones most often assumed in analyzing lifetime data, and in many cases they compete with each other. In addition, lifetime data are usually censored because of the constraint on the amount of testing time. A literature review reveals that little attention has been paid to the selection problem for censored samples. In this article, the relative performances of two selection procedures, namely the maximized likelihood and scale invariant procedures, are compared for selecting between the Weibull and lognormal distributions for censored as well as complete samples. Monte Carlo simulation experiments are conducted for various combinations of the censoring rate and sample size, and the performance of each procedure is evaluated in terms of the probability of correct selection (PCS) and the average error rate. Previously unknown behaviors and relative performances of the two procedures are then summarized. Computational results suggest that the maximized likelihood procedure can generally be recommended for censored as well as complete samples.
Choosing the correct or best-fitting distribution for a given set of data is an important issue, especially when the tail probability, which is usually sensitive to the assumed model, is of interest, as in reliability engineering. Statistical methods for distribution choice include probability plotting (Nelson, 1982), goodness-of-fit (GOF) testing, hypothesis testing (HT), and selection procedures. Probability plotting provides useful information concerning the right distribution. However, it can be subjective and may yield multiple plots which appear to be equally adequate. Of the usual GOF and HT approaches, one may prefer the latter when one does not wish to reject the distribution in the null hypothesis without an alternative to take its place. However, the HT approach treats the two distributions asymmetrically, while selection procedures put candidate distributions on an equal footing and can deal with more than two distributions at the same time.
Many authors have developed selection procedures with their own selection statistics and decision rules. The two best-known procedures are those based on the maximized likelihood function (MLF) and on the scale invariant (SI) selection statistic, respectively. For the MLF-based procedure, the reader is referred to Bain and Engelhardt (1980), Kappenman (1982), Gupta and Kundu (2003, 2004), Kundu and Manglick (2004), Kundu et al. (2005), Strupczewski et al. (2006), and Kundu and Raqab (2007), among others. The SI selection procedure was developed in Quesenberry and Kent (1982) for selecting among the exponential, Weibull, lognormal, and gamma distributions. In both procedures, the distribution with the largest value of the selection statistic is selected. For other selection procedures, the reader is referred to Croes et al. (1998), Marshall et al. (2001), Cain (2002), Dick (2004) and Mitosek et al. (2006), among others. In addition, Bayesian selection procedures were developed by Kim et al. (2000), Upadhyay and Peshwani (2003), Araújo and Pereira (2007), etc.
Several authors have compared the relative performance of selection procedures. Siswadi and Quesenberry (1982) compared the SI, scale-shape invariant (SSI), and MLF procedures when complete data are available, and the SI and MLF procedures when the data are Type-I censored. For other comparative studies, the reader is referred to Kappenman (1989), Pandey et al. (1991), Lee and Pope (2006), Mitosek et al. (2006), and Basu et al. (2008). A review of the related literature reveals that little attention has been paid to the selection problem under censoring. Exceptions include Siswadi and Quesenberry (1982) for Type-I censored samples, Croes et al. (1998), Cain (2002) and Kim et al. (2000) for Type-II censored samples, and Block and Leemis (2008) for randomly right-censored samples.
In this paper, the ratio of the maximized likelihoods (RML) and SI procedures are compared for discriminating between the Weibull and lognormal distributions for complete, Type-I censored, and Type-II censored samples. (The RML procedure is equivalent to the MLF procedure in the case of two distributions.) The Weibull and lognormal distributions are the ones most often assumed, and they compete with each other in analyzing lifetime data (e.g., see Bain (1978), Chen et al. (1987) and Prendergast et al. (2005) for Time Dependent Dielectric Breakdown (TDDB) data, and Lloyd (1979) and Pinto (1991) for electromigration lifetime data). In such cases, it is highly desirable to have a means of discriminating between the two distributions, since they differ significantly in the low percentiles, which are of interest in lifetime data analysis. In addition, lifetime data are usually censored because of the constraint on the amount of testing time, which necessitates an extensive comparative study of selection procedures under censoring.
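For orientation, the likelihood of a right-censored sample and the RML decision rule built on it can be written as below. This is a sketch in assumed notation (the failure indicator $\delta_i$, density $f$, survival function $S$, and statistic $T$ are introduced here for illustration and are not taken from the paper's Section 2). Under Type-I censoring every censored $x_i$ equals the fixed test time; under Type-II censoring it equals the last observed failure time. In both cases the likelihood has the same form up to a constant that cancels in the ratio.

```latex
L(\theta) \;\propto\; \prod_{i=1}^{n} f(x_i;\theta)^{\delta_i}\, S(x_i;\theta)^{1-\delta_i},
\qquad
\delta_i =
\begin{cases}
1, & x_i \text{ is an observed failure time},\\
0, & x_i \text{ is a censoring time},
\end{cases}
\\[6pt]
T \;=\; \frac{\max_{\theta_W} L_W(\theta_W)}{\max_{\theta_L} L_L(\theta_L)},
\qquad
\text{select Weibull if } T > 1, \text{ lognormal otherwise.}
```

For a complete sample all $\delta_i = 1$ and the likelihood reduces to the usual product of densities.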
Siswadi and Quesenberry (1982) compared the SI and MLF procedures for Type-I censored samples, but only considered the case where the sample size is 30 and the expected censoring rate is 10%. As the simulation results in Section 3 show, considering only a few specific cases can give a misleading picture of the overall behavior of the two procedures. In this paper, the performances of the RML and SI procedures are evaluated in terms of the probability of correct selection (PCS) for various simulated data sets with different sample sizes and censoring rates.
The rest of this article is organized as follows. In Section 2, the two procedures are described in detail, and Section 3 shows the comparison results obtained from the simulation experiments. Finally, conclusions and guidelines are presented in Section 4.
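To make the idea of estimating the PCS by simulation concrete, the following is a minimal sketch for complete samples only, not the authors' implementation. It fits both distributions by maximum likelihood, selects the one with the larger maximized log-likelihood (the RML rule for two candidates), and counts how often the true distribution is selected. The function names, the generating parameters, the seed, and the bisection solver for the Weibull shape are all assumptions made for illustration.

```python
import math
import random


def lognormal_max_loglik(x):
    """Maximized lognormal log-likelihood (closed-form MLE on the log scale)."""
    logs = [math.log(v) for v in x]
    n = len(logs)
    mu = sum(logs) / n
    var = sum((y - mu) ** 2 for y in logs) / n
    return -sum(logs) - 0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def weibull_max_loglik(x):
    """Maximized two-parameter Weibull log-likelihood for a complete sample."""
    logs = [math.log(v) for v in x]
    n = len(logs)
    ybar, ymax = sum(logs) / n, max(logs)
    if ymax == min(logs):
        raise ValueError("need at least two distinct observations")

    def score(k):
        # Profile likelihood equation for the shape k; increasing in k.
        # Weights are scaled by exp(-k * ymax) to avoid overflow.
        w = [math.exp(k * (y - ymax)) for y in logs]
        return sum(wi * y for wi, y in zip(w, logs)) / sum(w) - 1.0 / k - ybar

    lo, hi = 1e-3, 1.0
    while score(hi) < 0.0:          # expand until the root is bracketed
        hi *= 2.0
    for _ in range(60):             # plain bisection
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    # MLE of the scale satisfies scale**k = mean(x**k).
    log_scale = ymax + math.log(sum(math.exp(k * (y - ymax)) for y in logs) / n) / k
    return n * math.log(k) - n * k * log_scale + (k - 1.0) * sum(logs) - n


def select_rml(x):
    """Pick the distribution with the larger maximized log-likelihood."""
    return "weibull" if weibull_max_loglik(x) > lognormal_max_loglik(x) else "lognormal"


def estimate_pcs(true_dist, n, reps=1000, seed=0):
    """Monte Carlo estimate of the probability of correct selection."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        if true_dist == "weibull":
            # Illustrative parameters: scale 1.0, shape 2.0.
            x = [rng.weibullvariate(1.0, 2.0) for _ in range(n)]
        else:
            # Illustrative parameters: mu 0.0, sigma 0.5 on the log scale.
            x = [rng.lognormvariate(0.0, 0.5) for _ in range(n)]
        hits += select_rml(x) == true_dist
    return hits / reps


if __name__ == "__main__":
    for dist in ("weibull", "lognormal"):
        print(dist, estimate_pcs(dist, n=50, reps=500))
```

Because the logarithms of Weibull and lognormal variables both belong to location-scale families, the choice of generating parameters should not affect the PCS of this rule for complete samples, which is why fixed values are used here.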
Conclusions
The relative performances of the RML and SI procedures are compared for selecting between the Weibull and lognormal distributions. Censored as well as complete samples are considered for various combinations of the sample size n and censoring rate c using Monte Carlo simulation. Notable behaviors of the two selection procedures, observed in the present study but not recognized in previous work, include: (1) the decrease in the [symbol missing] as n increases if c is about 30% or larger; (2) in the Type-I censoring case, the decrease in the [symbol missing] up to a certain n if c is about 40% or larger; (3) the tendency of the SI procedure to favor the Weibull distribution for moderately to heavily censored cases if n exceeds a certain threshold; and (4) the tendency of the RML procedure to favor the lognormal distribution for censored samples for all combinations of n and c considered. As for the relative performance of the RML and SI procedures, computational results indicate that: (1) for complete samples, the RML procedure yields balanced PCS values for both distributions, although the average error rates of the two procedures are similar; and (2) for censored samples, if c is about 20% or less, the average error rates of the two procedures are not appreciably different. However, under heavier censoring and for relatively large n, the average error rate of the RML procedure becomes noticeably smaller than that of the SI procedure. In addition, the SI procedure involves additional time-consuming simulation runs to evaluate the last term on the right-hand side of Eq. (6).
In summary, the RML procedure is recommended for selecting between the Weibull and lognormal distributions for censored as well as complete samples, with the understanding that it (like any other selection procedure) may require a large n and/or a large number of failures in censored cases to lower the average error rate, and that it tends to favor the lognormal distribution, especially under Type-I censoring. In reliability engineering, the need for a large n and/or a large number of failures could be alleviated by employing an accelerated life test. Further investigation is needed to explain why, under moderate to heavy Type-I censoring, the PCS of the RML procedure when the true distribution is lognormal decreases up to a certain sample size and then increases. In addition, various extended Weibull and lognormal distributions have recently appeared in the literature (e.g., see Murthy et al. (2004), Al-Saleh and Agarwal (2006), Pham and Lai (2007), Sultan et al. (2007) and Zhang and Xie (2007) for extended Weibull distributions; and Flynn (2004), Flynn (2005), Chen (2006), and Vera and Díaz-García (2008) for extended lognormal distributions). They generally fit the data better than the two-parameter distributions, although the difference in fit could be insignificant (Alqam et al., 2002) or may depend on the selection criterion adopted (Lu et al., 2002). This suggests that extending the present study to those extended distributions and/or to selection procedures other than the RML and SI procedures would be a fruitful area of future research.
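As an illustration of how the RML rule can be applied to a Type-I censored sample, the following minimal sketch maximizes the right-censored log-likelihood of each distribution and compares the two maxima. It is written under stated assumptions and is not the paper's code: the Weibull shape is found by bisection on the censored likelihood equation, the lognormal parameters by an EM iteration for a censored normal on the log scale, and the function names, the censoring time `tau`, and the generating parameters are hypothetical.

```python
import math
import random

SQRT2 = math.sqrt(2.0)


def norm_sf(z):
    """Standard normal survival function, floored to keep log() finite."""
    return max(0.5 * math.erfc(z / SQRT2), 1e-300)


def weibull_loglik(x, failed, k, scale):
    """Right-censored Weibull log-likelihood at shape k and the given scale."""
    total = 0.0
    for v, d in zip(x, failed):
        u = (v / scale) ** k
        total += (math.log(k / scale) + (k - 1.0) * math.log(v / scale) - u) if d else -u
    return total


def lognormal_loglik(x, failed, mu, sigma):
    """Right-censored lognormal log-likelihood at (mu, sigma)."""
    total = 0.0
    for v, d in zip(x, failed):
        y = math.log(v)
        z = (y - mu) / sigma
        if d:
            total += -y - math.log(sigma) - 0.5 * math.log(2.0 * math.pi) - 0.5 * z * z
        else:
            total += math.log(norm_sf(z))
    return total


def fit_weibull(x, failed):
    """Censored Weibull MLE; assumes at least two distinct values and one failure."""
    y = [math.log(v) for v in x]
    yf = [v for v, d in zip(y, failed) if d]
    r, ymax = len(yf), max(y)
    ybar_f = sum(yf) / r

    def score(k):
        w = [math.exp(k * (v - ymax)) for v in y]
        return sum(wi * v for wi, v in zip(w, y)) / sum(w) - 1.0 / k - ybar_f

    lo, hi = 1e-3, 1.0
    while score(hi) < 0.0:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if score(mid) < 0.0 else (lo, mid)
    k = 0.5 * (lo + hi)
    # scale**k = sum over ALL observations of x**k, divided by the number of failures
    log_scale = ymax + math.log(sum(math.exp(k * (v - ymax)) for v in y) / r) / k
    return k, math.exp(log_scale)


def fit_lognormal(x, failed, iters=1000, tol=1e-10):
    """Censored lognormal MLE via EM on the log scale."""
    y = [math.log(v) for v in x]
    n = len(y)
    mu = sum(y) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in y) / n) or 1.0
    for _ in range(iters):
        s1 = s2 = 0.0
        for v, d in zip(y, failed):
            if d:
                s1 += v
                s2 += v * v
            else:
                z = (v - mu) / sigma
                h = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi) / norm_sf(z)
                s1 += mu + sigma * h                                  # E[Y | Y > v]
                s2 += mu * mu + sigma * sigma + sigma * (v + mu) * h  # E[Y^2 | Y > v]
        new_mu = s1 / n
        new_sigma = math.sqrt(s2 / n - new_mu * new_mu)
        done = abs(new_mu - mu) + abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if done:
            break
    return mu, sigma


def select_rml(x, failed):
    """Pick the distribution with the larger maximized censored log-likelihood."""
    lw = weibull_loglik(x, failed, *fit_weibull(x, failed))
    ll = lognormal_loglik(x, failed, *fit_lognormal(x, failed))
    return "weibull" if lw > ll else "lognormal"


def type1_sample(rng, n, tau):
    """Weibull(scale=1.0, shape=1.5) lifetimes, Type-I censored at time tau."""
    t = [rng.weibullvariate(1.0, 1.5) for _ in range(n)]
    return [min(v, tau) for v in t], [v <= tau for v in t]


if __name__ == "__main__":
    x, failed = type1_sample(random.Random(0), 100, tau=1.37)  # roughly 20% censoring
    print(sum(failed), "failures; selected:", select_rml(x, failed))
```

Wrapping `select_rml` in a replication loop over different values of n and `tau` would give PCS estimates under censoring in the same way as for complete samples.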