English ISI article no. 24572
Title

Bayesian identification of clustered outliers in multiple regression
Article code: 24572
Year of publication: 2007
Length of English article: 13 pages (PDF)
Source

Publisher : Elsevier - ScienceDirect

Journal : Computational Statistics & Data Analysis, Volume 51, Issue 8, 1 May 2007, Pages 3955–3967

Keywords
Switching regression, Least median of squares, Gibbs samplers, Metropolis–Hastings algorithm

Abstract

We propose a Bayesian model for clustered outliers in multiple regression. In the literature, outliers are frequently modeled as coming from a subgroup where the variance of the errors is much larger than in the rest of the data. By contrast, when a cluster of outliers exists, we show that it can be more informative to model them as coming from a subgroup where different regression coefficients hold. We can explicitly model the clustering phenomenon by assuming that the probability of an outlier is a function of the explanatory variables. Fitting proceeds via the Gibbs sampler, using the Metropolis–Hastings algorithm to produce variates from the more unusual distributions. Initialization uses a least median of squares fit, and in some ways this method can be viewed as a Bayesian version of the many algorithms that use this fit as a start to some more efficient estimator. This method works very well in a variety of test data sets. We illustrate its use in a data set of sailboat prices, where it yields information both on the identity of the outliers and on their location, spread, and the regression coefficients inside the minority subgroup.
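
The fitting scheme described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a two-group switching regression with a constant outlier probability `p` (the paper lets this probability depend on the explanatory variables and uses Metropolis–Hastings steps, both omitted here), weak conjugate priors chosen for convenience, and an approximate least median of squares start found by random two-point subsets. All function names, prior settings, and the synthetic data are hypothetical.

```python
import numpy as np

def lms_fit(X, y, rng, n_subsets=500):
    """Approximate least median of squares via random elemental subsets."""
    n, k = X.shape
    best_beta, best_crit = None, np.inf
    for _ in range(n_subsets):
        idx = rng.choice(n, size=k, replace=False)
        try:
            beta = np.linalg.solve(X[idx], y[idx])  # exact fit to k points
        except np.linalg.LinAlgError:
            continue  # singular subset, skip it
        crit = np.median((y - X @ beta) ** 2)
        if crit < best_crit:
            best_beta, best_crit = beta, crit
    return best_beta, best_crit

def draw_group(X, y, sigma2, rng, tau2=1e4, a0=1.0, b0=0.01):
    """Draw beta | sigma2, then sigma2 | beta, for one group.
    Assumed priors: beta ~ N(0, tau2 * I), sigma2 ~ Inv-Gamma(a0, b0)."""
    k = X.shape[1]
    cov = np.linalg.inv(X.T @ X / sigma2 + np.eye(k) / tau2)
    mean = cov @ (X.T @ y) / sigma2
    beta = mean + np.linalg.cholesky(cov) @ rng.standard_normal(k)
    ssr = np.sum((y - X @ beta) ** 2)
    sigma2 = 1.0 / rng.gamma(a0 + len(y) / 2, 1.0 / (b0 + ssr / 2))
    return beta, sigma2

def gibbs_switching(X, y, n_iter=600, burn=200, seed=0):
    """Gibbs sampler for a two-group regression; group 1 is the outlier group."""
    rng = np.random.default_rng(seed)
    n = len(y)
    beta_lms, crit = lms_fit(X, y, rng)
    scale = 1.4826 * np.sqrt(crit)                  # robust residual scale
    z = np.abs(y - X @ beta_lms) > 2.5 * scale      # initial outlier flags
    s2 = np.array([scale ** 2, np.var(y)])
    betas = [None, None]
    z_sum = np.zeros(n)
    beta_sum = np.zeros((2, X.shape[1]))
    for it in range(n_iter):
        for g in (0, 1):
            m = z if g else ~z
            betas[g], s2[g] = draw_group(X[m], y[m], s2[g], rng)
        n1 = z.sum()
        p = rng.beta(1 + n1, 1 + n - n1)            # Beta(1, 1) prior on p
        ll = [-0.5 * np.log(s2[g]) - 0.5 * (y - X @ betas[g]) ** 2 / s2[g]
              for g in (0, 1)]
        log_odds = np.log(p) - np.log1p(-p) + ll[1] - ll[0]
        z = rng.random(n) < 1.0 / (1.0 + np.exp(-np.clip(log_odds, -700, 700)))
        if it >= burn:
            z_sum += z
            beta_sum += np.array(betas)
    kept = n_iter - burn
    return z_sum / kept, beta_sum / kept

# Synthetic example: 80 points on one line, 20 clustered points on another.
data_rng = np.random.default_rng(1)
x_in = data_rng.uniform(0, 10, 80)
x_out = data_rng.uniform(7, 10, 20)
x = np.concatenate([x_in, x_out])
y = np.concatenate([1 + 2 * x_in, -5 + 0.5 * x_out]) + data_rng.normal(0, 0.5, 100)
X = np.column_stack([np.ones(100), x])
truth = np.arange(100) >= 80

prob, beta_mean = gibbs_switching(X, y)
print(np.flatnonzero(prob > 0.5))   # indices with high posterior outlier probability
print(beta_mean)                    # row 0: majority group, row 1: outlier group
```

Giving the outlier group its own coefficients, rather than only an inflated variance, is what allows the sampler to report a regression line for the minority subgroup as well as the identity of its members.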