Article title

General value iteration based reinforcement learning for solving optimal tracking control problem of continuous-time affine nonlinear systems

Article code	Publication year	Pages (English article)
105792	2017	34 pages (PDF)

Source

Publisher : Elsevier - Science Direct

Journal : Neurocomputing, Volume 245, 5 July 2017, Pages 114-123

Keywords

Adaptive dynamic programming; Optimal control; Reinforcement learning; Continuous-time systems

Abstract

In this paper, a novel reinforcement learning (RL) based approach is proposed to solve the optimal tracking control problem (OTCP) for continuous-time (CT) affine nonlinear systems using general value iteration (VI). First, the tracking performance criterion is described in a total-cost manner without a discount term, which can ensure the asymptotic stability of the tracking error. Then, some mild assumptions are made to relax the requirement of an initial admissible control found in most existing references. Based on these assumptions, the general VI method is proposed, and three situations are considered to show convergence from any initial positive performance function. To validate the theoretical results, the proposed general VI method is implemented with two neural networks on a nonlinear spring-mass-damper system, and two situations are considered to show its effectiveness.
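
The central idea of the abstract, that value iteration can start from an arbitrary positive performance function rather than from an admissible control, can be illustrated with a much simpler stand-in. The sketch below is not the paper's continuous-time, neural-network-based method. It is a hypothetical scalar, discrete-time, linear-quadratic analogue in which the tracking error evolves as e[k+1] = a*e[k] + b*u[k], the undiscounted cost sums q*e^2 + r*u^2, and the value function is assumed to have the form V(e) = p*e^2. All names and numeric values (`a`, `b`, `q`, `r`, `bellman_update`, `general_value_iteration`) are assumptions chosen for illustration.

```python
def bellman_update(p, a, b, q, r):
    """One value-iteration step for V(e) = p * e**2 (scalar LQ stand-in)."""
    # Greedy feedback gain for the current value estimate: u = -k * e
    k = a * b * p / (r + b * b * p)
    # Stage cost under that gain plus the value of the next error state
    return q + r * k * k + p * (a - b * k) ** 2


def feedback_gain(p, a, b, r):
    """Greedy gain k associated with the value coefficient p."""
    return a * b * p / (r + b * b * p)


def general_value_iteration(p0, a=1.2, b=1.0, q=1.0, r=1.0,
                            tol=1e-12, max_iter=1000):
    """Iterate the Bellman update from an arbitrary positive initial p0."""
    p = p0
    for _ in range(max_iter):
        p_next = bellman_update(p, a, b, q, r)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p


if __name__ == "__main__":
    # a = 1.2 makes the error dynamics unstable without control, so the
    # zero control is not admissible; no stabilizing initial policy is used.
    for p0 in (0.01, 1.0, 100.0):
        p_star = general_value_iteration(p0)
        k_star = feedback_gain(p_star, 1.2, 1.0, 1.0)
        print(f"p0={p0:>6}: p={p_star:.6f}, closed loop={1.2 - k_star:.6f}")
```

In this toy setting the iteration reaches the same limit whether the initial value lies below or above it, and the resulting closed-loop factor `a - b*k` has magnitude less than one, so the error decays. The paper's convergence analysis for nonlinear continuous-time systems is a separate result that this sketch does not reproduce.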