برنامه ریزی پویا دیفرانسیل تطبیقی برای کنترل بهینه چند هدفه
|کد مقاله||سال انتشار||مقاله انگلیسی||ترجمه فارسی||تعداد کلمات|
|24843||2002||13 صفحه PDF||سفارش دهید||9261 کلمه|
Publisher : Elsevier - Science Direct (الزویر - ساینس دایرکت)
Journal : Automatica, Volume 38, Issue 6, June 2002, Pages 1003–1015
An efficient numerical solution scheme entitled adaptive differential dynamic programming is developed in this paper for multiobjective optimal control problems with a general separable structure. For a multiobjective control problem with a general separable structure, the “optimal” weighting coefficients for various performance indices are time-varying as the system evolves along any noninferior trajectory. Recognizing this prominent feature in multiobjective control, the proposed adaptive differential dynamic programming methodology combines a search process to identify an optimal time-varying weighting sequence with the solution concept in the conventional differential dynamic programming. Convergence of the proposed adaptive differential dynamic programming methodology is addressed.
Many real-world control problems are characterized by their multiple performance measures that are often noncommensurable and conflict with each other. During the past three decades the development of multiobjective control (Zadeh, 1963; Salukvadze, 1979; Toivonen & Mäkilä, 1989; Li, 1990; Khargonekar & Rotea, 1991; De Nicolao & Locatelli, 1992; Li 1993a and Li 1993b; Carvalho & Ferreira, 1995) has grown by leaps and bounds. Based on the principle of optimality, multiobjective dynamic programming (Brown & Strauch, 1965; Yu & Seiford, 1981; Li and Haimes 1987 and Li and Haimes 1989) is a powerful solution methodology in solving multiobjective control problems. The curse of dimensionality in dynamic programming prevents its direct adoption in many real-world large-scale control problems. Reducing the dimensionality in dynamic programming (Larson & Korsak, 1970; Morin & Esogbue, 1974; Bertsekas & Castañon, 1989; Haurie & L'Ecuyer, 1986; Johnson, Stedinger, & Shoemaker, 1993; Mayne, 1966; Jacobson & Mayne, 1970; Yakowitz & Rutherford, 1984; Liao & Shoemaker, 1991; Luus, 1998) has been a challenging research task in front of the control and optimization community for many years. The curse of dimensionality of dynamic programming is further aggravated in situations with a vector-valued objective function. There is, however, no report in the literature on efficient numerical algorithms for multiobjective optimal control problems with a general separable structure. We consider in this paper the following general class of multiobjective optimal control problems: (P) equation(1) equation(2) where , and . In problem (P), all performance indices , are assumed to be backward separable. More specifically, there exist functions , and , such that each Ji(X,U) can be expressed by a backward nested form: equation(3) where ∂yt[i]/∂yt+1[i]>0 for all i=1,…,k,t=0,1,…,T−1. (This assumption ensures the satisfaction of “the monotonicity condition”.) Eq. (3) can be rewritten in a compact form: equation(4) where and We assume in this paper that all functions have continuous second-order derivatives. Problem formulation in (P) covers all types of multiobjective control problems with a separable structure in the sense of multiobjective dynamic programming. Different performance indices, , in (P) may take different function forms and the composite operations in a performance index Ji may vary from stage to stage. General separable performance indices that are not stagewise additive often result from optimization problems in areas such as minimax control, reliability optimization, multi-reservoir system, chemical engineering process, and mathematical programming, see, for example, Tauxe, Inamn, and Mades (1979), Mine and Fukushima (1979), Li (1995) and Luus (1997). Scalar-valued dynamic programming with a general separable performance index is studied in Nemhauser (1966) and Furukawa and Iwamoto (1976). Multiobjective dynamic programming with general separable performance indices is studied in Yu and Seiford (1981) and Li and Haimes (1987). There are some special features in this general class of multiobjective control problems. When some functions of Ji's are of nonlinear stagewise forms, an attempt of scalarization to convert problem (P) into a scalar-valued optimal control problem will lead to a formulation that is nonseparable in the sense of dynamic programming. In addition, the approach using state augmentation will also fail when some performance indices are backward separable, but not forward separable. Multiobjective optimal control problem (P) with a general separable structure can be solved by using the envelope approach ( Li & Haimes, 1987), a multiobjective dynamic programming method. When an analytical solution form is not obtainable, the numerical implementation of the envelope approach with discretization in computation is not mathematically tractable, or even infeasible when the problem dimension is large. The limitation of the envelope approach comes about from discretization, thus the curse of dimensionality of dynamic programming. This dimensionality problem becomes more serious in multiobjective dynamic programming than in the single-objective dynamic programming, since instead of recording the scalar cost-to-go and its associated optimal control, the vector-valued cost-to-go and its associated set of noninferior solutions have to be recorded for each grid of the state at each stage. In this paper, we aim at developing an efficient numerical solution method, entitled adaptive differential dynamic programming, for problem (P). This new solution method extends the idea, concept, and the solution scheme of differential dynamic programming ( Mayne, 1966; Jacobson & Mayne, 1970; Yakowitz & Rutherford, 1984; Liao & Shoemaker, 1991) to multiobjective dynamic programming. The dimensionality problem in multiobjective dynamic programming will be overcome in the proposed new method since no discretization is involved. The presentation of this paper is as follows. Some basic properties of problem (P) will be discussed in the next section. Adaptive differential dynamic programming will be developed in Section 3 for (P). The convergence analysis of adaptive differential dynamic programming will be carried out in Section 4. Numerical implementation of adaptive differential dynamic programming is shown in Section 5. The paper concludes in Section 6 with some concluding remarks.
نتیجه گیری انگلیسی
The advantage of adopting dynamic programming comes from its inherent efficiency arisen from decomposition and its power to generate a globally optimal feedback policy from embedding. For some special classes of multiobjective control problems, such as convex control problems with a stagewise additive vector-valued objective function, there does not exist an essential difference between the vector- and scalar-valued control problems in terms of the solution scheme. The solution scheme is simply to convert the original vector-valued control problem into a scalar-valued optimal control problem by aggregating multiple performance indices into a weighted overall performance index. In contrast to multiobjective continuous-time control problems, a vector-valued objective function may have different composition operators in its components and the composition operator may vary from stage to stage in multiobjective discrete-time control problems. The major challenge in multiobjective dynamic programming comes from these multiobjective control problems with general separable structures. In order to cope with the time-varying weighting vector among multiple objectives, one key task is to calculate the vector-valued cost-to-go for multiobjective control problems with general separable structures. This solution scheme, however, will be often trapped into the curse of dimensionality. The adaptive differential dynamic programming proposed in this paper overcomes the problem of dimensionality in multiobjective dynamic programming, and has thus provided a prominent tractable numerical solution algorithm for multiobjective control. We should point out the price we pay for adopting a differential dynamic programming scheme, instead of the primal version of dynamic programming. The solution generated by the adaptive differential dynamic programming (and its predecessor, differential dynamic programming) is, in general, a local optimum and is not of a full form of feedback policy which could become indispensable in uncertain situations. A future research direction is to extend the reach of adaptive differential dynamic programming to multiobjective dynamic optimization problems which are not R+k-convex via integrating it with the convexification schemes reported recently in Li (1996) and Li and Biswal (1998). Like some optimal control codes for scalar-objective discrete-time systems ( Fisher & Jennings, 1992; Teo, Goh, & Wong, 1991), general computer codes can be developed in the future for multiobjective control problems with a general separable structure. A computer-aided generation of a representative subset of the noninferior solutions for a multiobjective control problem is indispensable for a decision maker to select his/her best preferred solution for the final implementation.