یک روش عددی برای کنترل بهینه ترکیبی بر اساس برنامه ریزی پویا
|کد مقاله||سال انتشار||مقاله انگلیسی||ترجمه فارسی||تعداد کلمات|
|25690||2011||21 صفحه PDF||سفارش دهید||15782 کلمه|
Publisher : Elsevier - Science Direct (الزویر - ساینس دایرکت)
Journal : Nonlinear Analysis: Hybrid Systems, Volume 5, Issue 2, May 2011, Pages 254–274
This contribution extends a numerical method for solving optimal control problems by dynamic programming to a class of hybrid dynamic systems with autonomous as well as controlled switching. The value function of the hybrid control system is calculated based on a full discretization of the state and input spaces. A bound for the error due to discretization is obtained from modeling the error as perturbation of the continuous dynamics and the cost terms. It is shown that the bound approaches zero and that the value function of the discretized variant converges to the value function of the original problem if the discretization parameters go to zero. The performance of a numerical scheme exploiting the discretized system is illustrated for two different examples treated previously in literature.
The theory of optimal control of dynamic systems is mainly based on two principles, namely the maximum principle (MP) and dynamic programming (DP), see e.g. . The MP provides necessary conditions for optimal trajectories and co-states while the DP theory derives the value function as the solution of a partial differential equation. The optimal input trajectory is calculated for both cases by solving a static optimization problem along the evolution of the system. The main elements of this optimization problem, the co-states (MP) or the value function (DP) respectively, are calculated off-line. The co-states are valid only for a particular initial state, while the value function covers the whole state space. Thus, the evaluation of the necessary conditions of the MP results in an optimal open-loop control and applying the DP principle leads to optimal closed-loop control. Both theories have been extended to hybrid dynamical systems, which combine continuous dynamics with discrete events, as e.g. recently described in the survey paper . With respect to extending the MP to hybrid systems, necessary optimality conditions for the co-states at the switching times are derived in ,  and . The extension of DP to hybrid systems, as considered in  and , determines the value function as the solution of a set of partial differential equations. Algorithms exploiting these principles are confronted with a number of problems. For relatively “simple” classes of systems, like linear continuous dynamics and quadratic costs, approximate closed-loop solutions exist, see e.g.  for autonomous switching,  for controlled switching, or  dealing with discrete time linear hybrid systems. Algorithms for applying the MP to general hybrid optimal control problems, solve the optimal control problem for a given sequence of discrete states and repeat this in a combinatorial sense over all possible discrete state sequences (See ). Such procedures typically incur exponential complexity with respect to the number of discrete states. For DP, the main contribution to computational complexity results from the discretization of the continuous part of the dynamics. For example, the prototype algorithm described in  uses a discretization of the hybrid state space and computes an approximation of the value function for the discretized system. In , the calculation of the value function for the discretized system is formulated as a linear optimization problem. A number of approaches have been proposed to reduce the computational complexity of determining optimal controls for hybrid systems. For example in , optimality zones for particular dynamics are pre-computed by sampling and a brute force method, such that the remaining set of discrete sequences is smaller. In  and , branch and bound methods are used to prune the search tree of sequences. In , the discrete states are relaxed, leading to mixed continuous dynamics. Algorithms implementing an efficient calculation, like presented in  and , of the value function of hybrid systems are rare. The work proposed here considers a class of hybrid models with autonomous (forced) as well as controlled (optional) switching with possible state resets. The algorithm presented for the solution is based on the DP principle and on a full discretization of the hybrid optimal control problem. The base of the algorithm is a Semi-Lagrangian discretization scheme in combination with piecewise affine functions to approximate the value function. The scheme, which is motivated by the one presented in  and  for purely continuous systems, essentially discretizes in space and time to obtain a finite representation of the hybrid dynamics. The value function for the discretized system, which is interpreted as a perturbed discrete time process, is linearly interpolated over the continuous state space to obtain an approximation of the value function for the original system. While the basic algorithm has been presented already in , the main contribution of this paper is the convergence analysis for the general numerical scheme. For the convergence analysis, the state space discretization is interpreted as a perturbation of the continuous dynamics and the costs terms. The approach is stimulated by the work in , which models the numerical error of computing the solution of an optimal stopping problem for continuous systems as perturbation. The optimal control problem considered here is formulated as an exit time optimal control problem. As discussed later, this setting includes a large variety of different optimal control problems. Unlike an infinite time discounted optimal control problem (see ,  and ), the region of computation need not be invariant. The paper is organized as follows. After some preliminaries, the hybrid system is defined in Section 3, and the optimal control problem is introduced in Section 4. In Section 5 the discrete time process under perturbation is defined and estimates for the difference between the original and the perturbed value function are provided. The state space discretization, its characterization as pertubation, and the algorithm for solving the fully discretized problem are described in Section 6. Section 7 defines the control law based on the calculated value function, and the approach is demonstrated for two examples in Section 8. The paper concludes with a discussion of the performance, the limits, and possible extensions in Section 9.
نتیجه گیری انگلیسی
In this contribution, the discretization approach for dynamic programming was extended to hybrid systems with autonomous and controlled switching behavior. The state space discretization is modeled by a perturbation of the continuous dynamics and running costs. The convergence of the perturbed discrete time process to the true solution is ensured in the limit. The fact that hybrid dynamics is considered (rather than a purely continuous one) adds only moderate additional complexity to the discretization step inherent to all dynamic programming approaches for continuous-time and continuous-space systems. Thus, the approach is well suited for the extension to a very general class of hybrid systems. Of course, dynamic programming underlies in general an exponential growth of complexity with the number of continuous states, limiting the applicability to systems of rather low dimension. The results obtained for the considered examples are close to the results based on the maximum principle. Algorithms based on the maximum principle are considered to be very accurate because no state space discretization is needed in the calculations, but they produce only open-loop solutions, unlike the approach presented here. Future research focuses on the exploration of alternative approximations to obtain higher accuracy of the value function approximation by coarser grids. For example, for purely continuous systems, a first step in this direction is presented in .