Download English ISI Article No. 105819
Persian Translation of the Article Title

Distributed and asynchronous Stochastic Gradient Descent with variance reduction

English Title
Distributed and asynchronous Stochastic Gradient Descent with variance reduction
Article Code | Publication Year | Pages
105819 | 2018 | 32 pages (English PDF)
Source

Publisher : Elsevier - Science Direct

Journal : Neurocomputing, Volume 281, 15 March 2018, Pages 27-36

Translated Keywords
Stochastic Gradient Descent; Variance reduction; Asynchronous communication protocol; Distributed machine learning algorithms
English Keywords
Stochastic Gradient Descent; Variance reduction; Asynchronous communication protocol; Distributed machine learning algorithms;
Article Preview
Article preview: Distributed and asynchronous Stochastic Gradient Descent with variance reduction

English Abstract

Stochastic Gradient Descent (SGD) with variance reduction techniques has proved powerful for training the parameters of a wide range of machine learning models. However, it cannot be extended to distributed systems trivially, owing to its intrinsic design. Although conventional systems such as PetuumSGD perform well on distributed machine learning tasks, they focus mainly on optimizing the communication protocol and do not exploit the potential benefits of a specific machine learning algorithm. In this paper, we analyze the asynchronous communication protocol in PetuumSGD and propose a distributed version of variance-reduced SGD named DisSVRG. DisSVRG adopts the variance reduction technique to update the parameters of a model; the newly learned parameters are then shared across nodes using the asynchronous communication protocol. Furthermore, we accelerate DisSVRG with an adaptive learning rate driven by an acceleration factor, and we propose an adaptive sampling strategy. Together, these methods greatly reduce the wait time during iterations and significantly accelerate the convergence of DisSVRG. Extensive empirical studies verify that DisSVRG converges faster than state-of-the-art variants of SGD and attains almost linear speedup in a cluster.
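The abstract does not spell out the update rule, but the variance reduction technique it builds on is the standard SVRG gradient correction: a stochastic gradient is debiased against a periodically recomputed full gradient at a snapshot point. The Python sketch below is a minimal single-machine illustration under that assumption; all names (`svrg`, `grad_fn`, etc.) are hypothetical, and DisSVRG's asynchronous parameter sharing, adaptive learning rate, and adaptive sampling are only marked in comments, since the abstract gives no formulas for them.

```python
import numpy as np

def svrg(grad_fn, w0, data, lr=0.05, epochs=10):
    """SVRG-style variance-reduced SGD (single-machine sketch).

    grad_fn(w, x) must return the gradient of the per-example loss at w.
    In DisSVRG this inner loop would run on each node, with the learned
    parameters exchanged asynchronously across nodes (not shown here).
    """
    n = len(data)
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(epochs):
        w_snap = w.copy()
        # Full gradient at the snapshot: the anchor used to cancel variance.
        full_grad = np.mean([grad_fn(w_snap, x) for x in data], axis=0)
        for _ in range(n):
            # Uniform draw; DisSVRG would replace this with its
            # adaptive sampling strategy.
            i = np.random.randint(n)
            # Variance-reduced gradient: unbiased, and its variance shrinks
            # as w approaches w_snap, which permits a constant step size.
            g = grad_fn(w, data[i]) - grad_fn(w_snap, data[i]) + full_grad
            # DisSVRG would adapt lr here using an acceleration factor.
            w -= lr * g
    return w

if __name__ == "__main__":
    # Toy least-squares problem: grad of 0.5*(w.x - y)^2 is (w.x - y)*x.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    w_true = rng.normal(size=5)
    y = X @ w_true
    data = list(zip(X, y))
    grad = lambda w, xy: (w @ xy[0] - xy[1]) * xy[0]
    w_hat = svrg(grad, np.zeros(5), data)
    print(np.linalg.norm(w_hat - w_true))  # should be close to zero
```

The key property, and the reason the abstract pairs variance reduction with distribution, is that the corrected gradient stays unbiased while its variance vanishes near the optimum, so a constant learning rate suffices; the distributed question the paper addresses is how to share the resulting updates asynchronously without the synchronization wait time.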