Income distribution and inequality measurement: The problem of extreme values
Article code | Year of publication | Length of the English article |
---|---|---|
11206 | 2007 | 29 pages (PDF) |
Publisher : Elsevier - Science Direct
Journal : Journal of Econometrics, Volume 141, Issue 2, December 2007, Pages 1044–1072
English abstract
We examine the statistical performance of inequality indices in the presence of extreme values in the data and show that these indices are very sensitive to the properties of the income distribution. Estimation and inference can be dramatically affected, especially when the tail of the income distribution is heavy, even when standard bootstrap methods are employed. However, use of appropriate semiparametric methods for modelling the upper tail can greatly improve the performance of even those inequality indices that are normally considered particularly sensitive to extreme values.
English introduction
There is a folk wisdom about inequality measures concerning their empirical performance. Some indices are commonly supposed to be particularly sensitive to specific types of change in the income distribution and may be rejected a priori in favour of others that are presumed to be “safer”. This folk wisdom is only partially supported by formal analysis and it is appropriate to examine the issue by considering the behaviour of inequality measures with respect to extreme values. An extreme value is an observation that is highly influential on the estimate of an inequality measure. It is clear that an extreme value is not necessarily an error or some form of contamination. It could in fact be an informative observation belonging to the true distribution, that is, a high-leverage observation. In this paper, we study the sensitivity of different inequality measures to extreme values, both in the case of contamination and in the case of high-leverage observations.
What is a “sensitive” inequality measure? This issue has been addressed in ad hoc discussion of individual measures in terms of their empirical performance on actual data (Braulke, 1983). Some of the welfare-theoretical literature focuses on transfer sensitivity (Shorrocks and Foster, 1987) and related concepts. But it is clear that informal discussion is not a satisfactory approach for characterising alternative indices; furthermore, the welfare properties of inequality measures in terms of the relative impact of transfers at different income levels will not provide a reliable guide to the way in which the measures may respond to extreme values. We need a general and empirically applicable tool.
Specifically, we need to address four key issues that are relevant to assessing the sensitivity of inequality measures: (1) influence functions and their performance in the presence of contamination; (2) sensitivity to high-leverage observations; (3) error in the rejection probability of tests in finite samples; (4) sensitivity under different underlying distributions and tail shapes. The paper will provide general results and simulation studies for each of these four topics using a variety of common inequality indices, in order to yield methods that are implementable in practice. In Section 2, we examine the sensitivity of inequality measures to contamination in the data, both in high and low incomes. In Section 3, we study the sensitivity of inequality measures to “high-leverage” observations, using Monte Carlo simulations to study the error in the rejection probability (ERP) of a test in finite samples. Section 4 examines the relationship between the apparent sensitivity of the inequality index and the shape of the income distribution. Section 5 proposes a method for detecting extreme values in practice and Section 6 concludes.
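The contamination question raised for Section 2 can be illustrated with a small numerical experiment. The sketch below is not the paper's procedure: it assumes a hypothetical lognormal sample of 500 incomes, adds one rogue observation at ten times the sample maximum, and reports the proportional change in the Gini coefficient and in three Generalised Entropy (GE) indices, all computed from the empirical distribution. The sample size, distribution and contamination size are illustrative assumptions.

```python
import math
import random


def gini(incomes):
    """Gini coefficient computed from the empirical distribution (EDF)."""
    y = sorted(incomes)
    n = len(y)
    # G = 2 * sum_i(i * y_(i)) / (n * sum_i(y_i)) - (n + 1) / n, ranks i = 1..n
    rank_weighted = sum(rank * value for rank, value in enumerate(y, start=1))
    return 2.0 * rank_weighted / (n * sum(y)) - (n + 1.0) / n


def generalized_entropy(incomes, alpha):
    """Generalised Entropy index GE(alpha) for strictly positive incomes."""
    n = len(incomes)
    mu = sum(incomes) / n
    if alpha == 0:  # mean logarithmic deviation
        return sum(math.log(mu / v) for v in incomes) / n
    if alpha == 1:  # Theil index
        return sum((v / mu) * math.log(v / mu) for v in incomes) / n
    return (sum((v / mu) ** alpha for v in incomes) / n - 1.0) / (alpha * (alpha - 1.0))


def relative_change(index, clean, contaminated):
    """Proportional change in an index when the sample is contaminated."""
    base = index(clean)
    return (index(contaminated) - base) / base


# Hypothetical experiment (assumption): 500 lognormal incomes plus one rogue value.
rng = random.Random(0)
clean = [rng.lognormvariate(0.0, 0.5) for _ in range(500)]
contaminated = clean + [10.0 * max(clean)]

changes = {
    "Gini": relative_change(gini, clean, contaminated),
    "GE(0)": relative_change(lambda y: generalized_entropy(y, 0), clean, contaminated),
    "GE(1)": relative_change(lambda y: generalized_entropy(y, 1), clean, contaminated),
    "GE(2)": relative_change(lambda y: generalized_entropy(y, 2), clean, contaminated),
}
for name, change in changes.items():
    print(f"{name}: {change:+.1%}")
```

A single experiment of this kind only hints at relative sensitivity. A formal comparison would rest on influence functions and repeated simulation, which is what the paper sets out to do.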
English conclusion
Very large incomes matter both in principle and practice when it comes to inequality judgments. This is true both in cases where the extreme values are genuine observations and where they represent some form of data contamination. But practical methods that appropriately take account of the problems raised by extreme values are still relatively hard to come by. In this paper we have demonstrated a practical way of detecting the potential problem of sensitivity to extreme values (see Section 5). However, our analysis of the relative performance of inequality indices can be used to make four broader points that may assist in the development of empirical methods of inequality analysis. First, semiparametric inequality measures (i.e. inequality measures based on a parametric-tailed estimation of the income distribution) are much less sensitive to contamination than those based directly on the EDF. This is true even where relatively unsophisticated methods are used for estimating the distribution that is used in the tail. Second, bootstrap methods are often useful, but some bootstrap methods can be catastrophically misleading. The bootstrap certainly works better than asymptotic methods but, given the typical heavy-tailed shape of the income distribution, the standard bootstrap often performs badly: indeed for some important cases the standard bootstrap is actually invalid. This negative conclusion applies to several commonly used inequality measures. However, the problem can be overcome by using a non-standard bootstrap; in particular the semiparametric bootstrap outperforms the other methods and gives accurate inference in finite samples. Third, in situations where semiparametric inequality measures can be used, they perform well in asymptotic tests and at least as well as semiparametric bootstrap methods.
Fourth, empirical researchers sometimes want to select an appropriate inequality measure on the basis of its performance with respect to extreme values: our analysis throws some light on the argument here. For example, it has been suggested that the Gini coefficient is going to be less prone to the influence of outliers than some of the alternative candidate inequality indices. As might be expected, the Gini coefficient is indeed less sensitive than GE indices to contamination in high incomes. However, in terms of performance in finite samples there is little to choose between the Gini coefficient and the GE index with α=0 (or equivalently the Atkinson index with ɛ=1); there is also little to choose between the Gini and the logarithmic variance. This is always true for estimation methods using the EDF; but if one uses semiparametric methods then one has an even stronger result. In such cases the apparent empirical advantage of the Gini coefficient over the alternatives virtually disappears.
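A semiparametric inequality measure of the kind described in the first point can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the upper tail is modelled as a Pareto distribution whose index is fitted with the Hill estimator, the tail is taken to be the top 10% of the sample, and the index is GE(0), the mean logarithmic deviation. The function names, the 10% tail share and the Pareto test sample are all hypothetical choices. The helper `atkinson_1` uses the identity A(1) = 1 - exp(-GE(0)), which is why the two indices rank distributions identically.

```python
import math
import random


def mld_edf(incomes):
    """GE(0), the mean logarithmic deviation, computed directly from the EDF."""
    n = len(incomes)
    mu = sum(incomes) / n
    return math.log(mu) - sum(math.log(v) for v in incomes) / n


def hill_estimator(incomes, k):
    """Hill estimate of the Pareto tail index from the k largest observations."""
    y = sorted(incomes, reverse=True)
    threshold = y[k]  # the (k+1)-th largest income marks the start of the tail
    inv_theta = sum(math.log(v / threshold) for v in y[:k]) / k
    return 1.0 / inv_theta, threshold


def mld_semiparametric(incomes, tail_share=0.10):
    """GE(0) using the EDF below a threshold and a fitted Pareto tail above it."""
    n = len(incomes)
    k = max(1, int(tail_share * n))  # assumption: fixed share of the sample in the tail
    theta, y0 = hill_estimator(incomes, k)
    if theta <= 1.0:
        raise ValueError("fitted Pareto tail has no finite mean (theta <= 1)")
    body = sorted(incomes)[: n - k]
    p = k / n
    # Pareto(y0, theta): E[y] = y0 * theta / (theta - 1), E[log y] = log(y0) + 1 / theta
    mean = (1 - p) * sum(body) / len(body) + p * y0 * theta / (theta - 1.0)
    mean_log = ((1 - p) * sum(math.log(v) for v in body) / len(body)
                + p * (math.log(y0) + 1.0 / theta))
    return math.log(mean) - mean_log


def atkinson_1(mld):
    """Atkinson index with epsilon = 1, a monotone transform of GE(0)."""
    return 1.0 - math.exp(-mld)


# Hypothetical check (assumption): multiply the largest of 1000 Pareto incomes by 100.
rng = random.Random(1)
clean = [rng.paretovariate(3.0) for _ in range(1000)]
top = max(clean)
contaminated = [100.0 * v if v == top else v for v in clean]

shift_edf = abs(mld_edf(contaminated) - mld_edf(clean))
shift_sp = abs(mld_semiparametric(contaminated) - mld_semiparametric(clean))
print(f"EDF shift: {shift_edf:.4f}, semiparametric shift: {shift_sp:.4f}")
```

In this sketch the contaminated observation enters the semiparametric index only through the fitted tail parameter, so its effect is damped by averaging over the tail observations, whereas in the EDF-based index it feeds directly into the sample mean.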