A large literature on confidence interval construction involving lognormal data has
accumulated, owing to the fact that many data in scientific inquiries may be
approximated by this distribution. Procedures have usually been developed in a piecemeal
fashion for a single mean, a single mean with excessive zeros, a difference between two
means, and a difference between two differences (net health benefit). As an alternative, we present a general approach for all these cases that requires only confidence limits available in introductory texts. Simulation results confirm the validity of this approach. Examples arising from health economics are used to illustrate the methodology.
The lognormal distribution may be used to approximate right-skewed data arising in a wide range of scientific
inquiries (Limpert et al., 2001). Traditional statistical analysis of such data has usually been focused on the means
of log-transformed data, resulting in inferences expressed in terms of geometric means rather than the arithmetic
means. However, there are many situations, including in environmental science (Parkhurst, 1998) and in occupational
health research (Rappaport and Selvin, 1987), in which arithmetic means may provide more meaningful information.
Consequently, a relatively large literature on statistical methods for this type of data has accumulated,
including Aitchison and Brown (1957) and Crow and Shimizu (1988), with more articles being added rapidly to the
literature (Chen, 1994; Taylor et al., 2002; Wu et al., 2002, 2003, 2006; Gill, 2004; Tian and Wu, 2006; Shen et al., 2006;
Krishnamoorthy et al., 2006; Bebu and Mathew, 2008; Fletcher, 2008).
Since many health cost data may be positively skewed (Thompson and Barber, 2000; Briggs et al., 2002), the literature
dealing with the analysis of lognormal data in this context has also increased substantially. This includes procedures for
a one sample mean, a difference between two independent sample means, a difference between two dependent sample
means, and additional zero values for each of these cases (Zhou, 2002). Recent advances include a method based on the
Edgeworth expansion (Zhou and Dinh, 2005; Dinh and Zhou, 2006). It is worthwhile to note that this approach not only
fails to provide adequate coverage rates but also lacks invariance in the sense that a confidence interval for
We have demonstrated that interval estimation involving lognormal data requires only the application of confidence
interval procedures found in introductory textbooks. Thus, it may be unnecessary to avoid lognormal assumptions for simplicity (Nixon and Thompson, 2005, p. 1226), or to rely on simulation of pivotal statistics (Krishnamoorthy and Mathew,
2003; Tian, 2005; Chen and Zhou, 2006; Krishnamoorthy et al., 2006).
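
To make the "introductory textbook" claim concrete, the following is a minimal sketch, not the authors' code. It assumes the MOVER (method of variance estimates recovery) takes the general form given in Zou and Donner (2008), and that the log of a lognormal mean is written as the sum of mu and sigma^2/2, with a t-based interval for mu and a chi-square-based interval for sigma^2. The function names, the argument names, and the sample summaries in the usage note are hypothetical; the critical values are supplied by the user, for example from printed tables.

```python
import math


def mover_sum(est1, l1, u1, est2, l2, u2):
    """Limits for theta1 + theta2 from separate limits (l_i, u_i) for each theta_i."""
    total = est1 + est2
    lower = total - math.sqrt((est1 - l1) ** 2 + (est2 - l2) ** 2)
    upper = total + math.sqrt((u1 - est1) ** 2 + (u2 - est2) ** 2)
    return lower, upper


def mover_diff(est1, l1, u1, est2, l2, u2):
    """Limits for theta1 - theta2 from separate limits (l_i, u_i) for each theta_i."""
    diff = est1 - est2
    lower = diff - math.sqrt((est1 - l1) ** 2 + (u2 - est2) ** 2)
    upper = diff + math.sqrt((u1 - est1) ** 2 + (est2 - l2) ** 2)
    return lower, upper


def lognormal_mean_ci(ybar, s2, n, t_crit, chi2_lo, chi2_hi):
    """Interval for a lognormal mean, exp(mu + sigma^2 / 2).

    ybar, s2 : sample mean and variance of the log-transformed data
    t_crit   : upper t critical value with n - 1 degrees of freedom
    chi2_lo, chi2_hi : lower and upper chi-square critical values, n - 1 df
    (All critical values are passed in; this sketch does not compute them.)
    """
    # Textbook t interval for mu.
    se = math.sqrt(s2 / n)
    mu_l, mu_u = ybar - t_crit * se, ybar + t_crit * se
    # Textbook chi-square interval for sigma^2, halved to give sigma^2 / 2.
    hv_l = (n - 1) * s2 / (2.0 * chi2_hi)
    hv_u = (n - 1) * s2 / (2.0 * chi2_lo)
    # Combine on the log scale, then back-transform.
    log_l, log_u = mover_sum(ybar, mu_l, mu_u, s2 / 2.0, hv_l, hv_u)
    return math.exp(log_l), math.exp(log_u)
```

As a hypothetical usage example, with n = 20, a log-scale mean of 5.0, and a log-scale variance of 1.0, a 95% interval would use the tabled values t = 2.093, chi-square lower = 8.907, and chi-square upper = 32.852 (19 degrees of freedom): `lognormal_mean_ci(5.0, 1.0, 20, 2.093, 8.907, 32.852)`. Note also that `mover_diff` is constructed so that swapping the two parameters simply negates and swaps the limits.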
Interval estimation based on transformations using the Edgeworth expansion (Zhou and Dinh, 2005) lacks the invariance
property of a confidence interval for a difference, and performs poorly for lognormal data (Zhou and Dinh, 2005; Dinh and
Zhou, 2006). One may argue that this procedure is nonparametric and thus it is unfair to compare it with the MOVER. Our
position is that to be nonparametric, a procedure must be able to provide valid results for data having a common distribution,
such as the lognormal in the present context.
Although we have deliberately used examples from health economics, the approach described here is also suitable for
lognormal data arising from such disciplines as economics and environmental science (Rappaport and Selvin, 1987; Crow
and Shimizu, 1988; Krishnamoorthy et al., 2006; Fletcher, 2008; Zou et al., 2009b). We should also mention that the MOVER
we described here can be readily applied to lognormal regression models (Bradu and Mundlak, 1970; El-Shaarawi and
Viveros, 1997; Wu et al., 2006; Tian and Wu, 2007b; Shen and Zhu, 2008), because in concept this problem is identical
to that of producing confidence intervals for lognormal means.
As a final note, we should emphasize that the procedures in Section 3 are applicable only to lognormal data. Whenever
there is clear evidence against the lognormal assumption, those procedures should not be used.
However, this is not an inherent deficiency of the MOVER. This point is supported by several applications and extensions
beyond lognormal data (Zou and Donner, 2008). Zou (2008) presents a further extension to interval estimation for measures
of additive interaction, which are functions of risk ratios having only asymptotically lognormal distributions. New applications
that set confidence limits for differences between Pearson correlations and coefficients of determination (R²) may be found
in Zou (2007), where it was found that the MOVER fails in the case of very small increments in R².