Discrete neural dynamic programming in wheeled mobile robot control
|Article code||Publication year||English article||Persian translation||Word count|
|25699||2011||8-page PDF||Available to order||3696 words|
Publisher : Elsevier - Science Direct
Journal : Communications in Nonlinear Science and Numerical Simulation, Volume 16, Issue 5, May 2011, Pages 2355–2362
In this paper we propose a discrete algorithm for the tracking control of a two-wheeled mobile robot (WMR) using an advanced Adaptive Critic Design (ACD). We use the Dual-Heuristic Programming (DHP) algorithm, which consists of two parametric structures implemented as Neural Networks (NNs): an actor and a critic, both realized as Random Vector Functional Link (RVFL) NNs. In the proposed algorithm the control system consists of the DHP adaptive critic, a PD controller and a supervisory term derived from the Lyapunov stability theorem. The supervisory term guarantees stable realization of the tracking movement during the learning phase of the adaptive critic structure, and robustness in the face of disturbances. The discrete tracking control algorithm works online, uses the WMR model for state prediction and does not require preliminary learning. The performance of the proposed control algorithm was verified in a series of experiments on the Pioneer 2-DX WMR.
WMRs are used for complex transport or inspection tasks in which the presence of a human operator is economically unjustified or would expose the operator to unnecessary danger. From a mathematical point of view, WMRs are non-holonomic objects described by nonlinear dynamic equations, which makes the synthesis of stable control algorithms difficult. Recent years have seen intensive development of effective artificial intelligence (AI) methods, such as NNs, fuzzy logic and reinforcement learning (RL) algorithms, for the synthesis of complex nonlinear control algorithms. This article proposes a discrete tracking control algorithm based on a Neural Dynamic Programming (NDP) algorithm in the form of an advanced model-based ACD in the DHP configuration. The ACD algorithm consists of two structures realized as RVFL NNs. The actor approximates the optimal control law and implements the current control policy. The critic rates the quality of the control signal by approximating the derivative of the value function with respect to the states, and passes feedback to the actor, which changes its policy accordingly. The presented discrete control algorithm does not require preliminary learning, works online and uses the WMR model for state prediction in the DHP structure. Stability of the control system is achieved by an additional supervisory control element derived from Lyapunov stability theory, which guarantees stability during the learning phase of the ACD NNs and robustness in the face of disturbances. The proposed control algorithm was verified on the Pioneer 2-DX WMR. The results presented in the article continue the authors' earlier work on the synthesis of WMR tracking control algorithms using NNs, fuzzy logic or RL methods.
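The RVFL structure used for both the actor and the critic can be illustrated with a minimal sketch. It assumes the common simplified RVFL form in which the input-to-hidden weights are drawn randomly once and kept fixed, and only the output weights are adapted online. The class name `RVFL`, the `tanh` activation, the gradient-style update and the learning rate are illustrative assumptions, not the authors' implementation, and the direct input-to-output links of the original RVFL formulation are omitted for brevity.

```python
import math
import random


class RVFL:
    """Illustrative single-output RVFL sketch (hypothetical, not the paper's code)."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # Input-to-hidden weights and biases are drawn once and never trained.
        self.input_weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
                              for _ in range(n_hidden)]
        self.biases = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
        # Only the output weights adapt; zero initial values need no pre-training.
        self.output_weights = [0.0] * n_hidden

    def hidden(self, x):
        """Outputs of the fixed random hidden layer for input vector x."""
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.input_weights, self.biases)]

    def predict(self, x):
        """Network output: weighted sum of the hidden-layer outputs."""
        return sum(w * s for w, s in zip(self.output_weights, self.hidden(x)))

    def update(self, x, target, rate=0.1):
        """One online gradient step on the squared error; returns the error before the step."""
        s = self.hidden(x)
        error = sum(w * si for w, si in zip(self.output_weights, s)) - target
        self.output_weights = [w - rate * error * si
                               for w, si in zip(self.output_weights, s)]
        return error


# Usage: adapt the output weights online toward a fixed target value.
net = RVFL(n_inputs=1, n_hidden=8, seed=1)
for _ in range(200):
    net.update([0.5], 1.0)
print(net.predict([0.5]))
```

Because only the output weights are trained, the network is linear in its adaptable parameters, which is one reason this kind of structure is convenient for online adaptation.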
The paper is organized as follows. Section 1 is a short introduction to the WMR tracking control problem. Section 2 presents the discrete model of the WMR dynamics, including the dynamics of the executive systems. Section 3 contains the synthesis of the discrete tracking control algorithm with the actor-critic architecture. Section 4 contains the stability analysis. Section 5 presents the results of the verification experiments carried out on the Pioneer 2-DX WMR. Section 6 summarizes the research project.
Conclusion
This paper presented a discrete tracking control algorithm with an advanced ACD in the DHP configuration. DHP consists of two adaptable structures implemented as NNs: the actor and the critic. In the proposed control algorithm the overall control signal is composed of the DHP control signal, the PD controller control signal, the supervisory term control signal and the Ye control signal. The supervisory term guarantees stable realization of the tracking movement during the learning phase of the NNs and robustness in the face of disturbances. The discrete tracking control algorithm works online, uses the WMR model for state prediction in the DHP structure and does not require preliminary learning. The proposed control algorithm was verified on the Pioneer 2-DX WMR. The experimental results show a higher quality of tracking control for the algorithm with the actor-critic structure than for the PD controller alone. The errors are bounded and decrease during the experiment from their initial values to values near zero. The weights of the actor-critic NNs are bounded and converge to fixed values during the adaptation process.
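The role of the WMR model in the DHP structure can be made concrete with the standard textbook form of the DHP critic relation, given here as a sketch in assumed notation rather than in the paper's own equations. The symbols $x_k$ (state), $u_k$ (control), $L_C$ (local cost) and $\gamma$ (discount factor) are assumptions introduced for illustration.

```latex
% Assumed notation: x_k state, u_k control, L_C local cost, \gamma discount factor.
\begin{align}
V(x_k) &= \sum_{i=0}^{\infty} \gamma^{i} L_C(x_{k+i}, u_{k+i}),
\qquad
\lambda(x_k) = \frac{\partial V(x_k)}{\partial x_k}, \\
% Differentiating V(x_k) = L_C(x_k, u_k) + \gamma V(x_{k+1}) with respect to x_k:
\lambda(x_k) &= \frac{\partial L_C}{\partial x_k}
 + \left( \frac{\partial u_k}{\partial x_k} \right)^{T} \frac{\partial L_C}{\partial u_k}
 + \gamma \left( \frac{\partial x_{k+1}}{\partial x_k}
 + \frac{\partial x_{k+1}}{\partial u_k} \frac{\partial u_k}{\partial x_k} \right)^{T}
 \lambda(x_{k+1}).
\end{align}
```

The critic is trained so that its output approaches the right-hand side of the second relation. The terms $\partial x_{k+1} / \partial x_k$ and $\partial x_{k+1} / \partial u_k$, together with the predicted state $x_{k+1}$, come from a model of the controlled object, which is why a DHP-type algorithm is model-based.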