Title

An iterative adaptive dynamic programming algorithm for optimal control of unknown discrete-time nonlinear systems with constrained inputs
Article code: 25875
Publication year: 2013
English article length: 12 pages (PDF)
Source

Publisher: Elsevier - ScienceDirect

Journal: Information Sciences, Volume 220, 20 January 2013, Pages 331–342

Keywords
Adaptive dynamic programming, Approximate dynamic programming, Control constraints, Globalized dual heuristic programming, Neural networks, Optimal control

Abstract

In this paper, the adaptive dynamic programming (ADP) approach is employed to design an optimal controller for unknown discrete-time nonlinear systems with control constraints. A neural network is constructed to identify the unknown dynamical system, with a stability proof. Then, the iterative ADP algorithm is developed to solve the optimal control problem, with a convergence analysis. Two other neural networks are introduced to approximate the cost function and its derivatives, and the control law, within the framework of the globalized dual heuristic programming (GDHP) technique. Furthermore, two simulation examples are included to verify the theoretical results.

Introduction

Nonlinear optimal control has been a focus of the control field for many decades [8] and [16]. It often requires solving the nonlinear Hamilton–Jacobi–Bellman (HJB) equation. For instance, the discrete-time HJB (DTHJB) equation is more difficult to work with than the Riccati equation because it involves solving nonlinear partial difference equations. Although dynamic programming has been a useful technique for handling optimal control problems for many years, it is often computationally untenable to run it to obtain the optimal solutions [4].

Effective techniques have been employed to construct learning systems [22], [20], [37], [19], [35], [3], [12] and [11]. Characterized by strong abilities of self-learning and adaptivity, artificial neural networks (ANN or NN) are also a functional tool for implementing learning control [33], [15], [13] and [34]. Additionally, they are often used to carry out universal function approximation in adaptive/approximate dynamic programming (ADP) algorithms. The ADP method was proposed by Werbos [33] and [34] to deal with optimal control problems forward-in-time. Several synonyms have been used for ADP, including “adaptive critic designs” [21], “adaptive dynamic programming” [30] and [17], “approximate dynamic programming” [34], [24] and [2], “neuro-dynamic programming” [5], “neural dynamic programming” [23], and “reinforcement learning” [6].

In recent years, ADP and related research have gained much attention from researchers [1], [2], [5], [6], [9], [10], [14], [17], [18], [21], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [34] and [36]. According to [21] and [34], ADP approaches were classified into several main schemes: heuristic dynamic programming (HDP); action-dependent HDP (ADHDP), also known as Q-learning; dual heuristic dynamic programming (DHP); action-dependent DHP (ADDHP); globalized DHP (GDHP); and action-dependent GDHP (ADGDHP). Al-Tamimi et al. [2] proposed a greedy HDP iteration algorithm to solve the DTHJB equation of optimal control of discrete-time affine nonlinear systems. Abu-Khalaf and Lewis [1], Vrabie and Lewis [27], and Vamvoudakis and Lewis [25] investigated continuous-time nonlinear optimal control problems based on the idea of ADP.

With the increasing complexity of industrial processes, data-based methods have attracted great interest among control engineers. They do not require accurate mathematical models of the controlled plants and thus have significant practical value. Kim and Lewis [14] presented a model-free H∞ control design scheme for unknown linear discrete-time systems via Q-learning, which was expressed in the form of a linear matrix inequality. Campi and Savaresi [7] proposed a virtual reference feedback tuning approach, which is in fact a data-based method.

In this paper, we solve the constrained optimal control problem of unknown discrete-time nonlinear systems based on the iterative ADP algorithm via the GDHP technique (i.e., the iterative GDHP algorithm). An NN model is constructed as an identifier to learn the unknown controlled plant. Then, the iterative ADP algorithm is introduced to solve the DTHJB equation, with a convergence proof. Next, the optimal controller can be designed by employing the GDHP technique.

This paper is organized as follows. In Section 2, the optimal control problem and the DTHJB equation are recalled for discrete-time nonlinear systems. In Section 3, we first design an NN identifier for the unknown controlled system, with a stability proof. Then, the optimal control scheme based on the iterative ADP algorithm is developed, with a convergence analysis. In Section 4, the implementation of the iterative ADP algorithm is presented through the NN-based GDHP technique. In Section 5, two numerical examples are given to demonstrate the effectiveness of the proposed optimal control scheme. In Section 6, concluding remarks are given.
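
For reference, the DTHJB equation and the iterative ADP scheme mentioned in this introduction are commonly written in the ADP literature along the following lines. The notation is an assumption chosen for illustration (F for the unknown dynamics, U for the utility, J* for the optimal cost, V_i and v_i for the iterates) and may differ from the paper's own symbols. The nonquadratic control penalty in the last line is a form often used for bounded inputs, not necessarily the one adopted in this paper.

```latex
% Assumed notation, not taken from the paper:
% x_k state, u_k control, F unknown dynamics, U utility, J^* optimal cost,
% \bar{U} = diag(\bar{u}_1, ..., \bar{u}_m) collects the input bounds.
\begin{align}
x_{k+1} &= F(x_k, u_k), \qquad
  u_k \in \Omega_u = \{\, u : |u_j| \le \bar{u}_j,\ j = 1,\dots,m \,\}, \\
J(x_k) &= \sum_{j=k}^{\infty} U(x_j, u_j), \\
% DTHJB equation (Bellman optimality condition)
J^*(x_k) &= \min_{u_k \in \Omega_u} \bigl\{ U(x_k, u_k) + J^*(x_{k+1}) \bigr\}, \\
% Iterative ADP (value-iteration form), started from a zero cost function
V_0(\cdot) &= 0, \\
v_i(x_k) &= \arg\min_{u_k \in \Omega_u} \bigl\{ U(x_k, u_k) + V_i(x_{k+1}) \bigr\}, \\
V_{i+1}(x_k) &= U\bigl(x_k, v_i(x_k)\bigr) + V_i\bigl(F(x_k, v_i(x_k))\bigr), \\
% A nonquadratic utility often used with bounded inputs (assumed form)
U(x_k, u_k) &= x_k^{\mathsf T} Q x_k
  + 2 \int_0^{u_k} \tanh^{-\mathsf T}\!\bigl(\bar{U}^{-1} s\bigr)\, \bar{U} R \,\mathrm{d}s .
\end{align}
```

In this form, the convergence analysis referred to above concerns whether the sequence V_i approaches J* as i grows.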

Conclusion

An iterative ADP algorithm is developed in this paper for near-optimal control of unknown discrete-time nonlinear systems with control constraints. The GDHP technique is employed to implement the algorithm, with three NNs constructed to approximate the cost function and its derivatives, the control law, and the unknown controlled system, respectively. The numerical examples demonstrate the validity of the control scheme. Since tracking is another important topic in control engineering, the developed approach should be extended to the optimal tracking control problem in the future. Additionally, because existing results on tracking control mainly address affine nonlinear systems, our future work will focus on the nonaffine case.
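
As a rough illustration of the iterative scheme summarized here, the following minimal sketch runs the value-iteration form of iterative ADP on a hypothetical scalar plant with a bounded input. It is not the authors' implementation: the function `plant`, the bound `U_MAX`, the quadratic `utility`, and the grids are all assumptions made for this example. A lookup table with linear interpolation stands in for the identifier, critic, and action networks of the GDHP technique, and the plant is treated as known.

```python
import numpy as np

U_MAX = 0.5  # assumed input bound |u| <= U_MAX (hypothetical value)


def plant(x, u):
    """Hypothetical scalar plant x_{k+1} = 0.8*sin(x_k) + u_k (not from the paper)."""
    return 0.8 * np.sin(x) + u


def utility(x, u):
    """Assumed quadratic stage cost; the input bound is enforced by the control grid."""
    return x**2 + u**2


def value_iteration(n_iter=200, tol=1e-8):
    """Tabular value iteration: V_{i+1}(x) = min_u { U(x,u) + V_i(plant(x,u)) }, V_0 = 0."""
    xs = np.linspace(-2.0, 2.0, 201)        # state grid
    us = np.linspace(-U_MAX, U_MAX, 41)     # admissible (bounded) controls
    X, Uc = np.meshgrid(xs, us, indexing="ij")
    # Successor state for every (state, control) pair, kept inside the grid.
    X_next = np.clip(plant(X, Uc), xs[0], xs[-1])
    stage = utility(X, Uc)

    V = np.zeros_like(xs)                   # V_0 = 0
    history = []
    for _ in range(n_iter):
        # Evaluate V_i at the successor states by linear interpolation.
        V_next = np.interp(X_next.ravel(), xs, V).reshape(X_next.shape)
        Q = stage + V_next                  # U(x,u) + V_i(x_next)
        V_new = Q.min(axis=1)               # minimize over the bounded controls
        history.append(V_new.copy())
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break

    # Greedy control with respect to the cost table used in the last sweep.
    policy = us[Q.argmin(axis=1)]
    return xs, V, policy, history
```

Calling `value_iteration()` returns the state grid, the final cost estimate, the greedy bounded control on that grid, and the list of iterates. Starting from a zero initial cost, the iterates should form a nondecreasing sequence, which mirrors the kind of monotone convergence usually analyzed for iterative ADP.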