LP approach to dynamic programming principles for stochastic control problems with state constraints
|Article code||Year of publication||English article||Persian translation||Word count|
|25874||2013||15-page PDF||Available to order||10,610 words|
Publisher: Elsevier - Science Direct
Journal: Nonlinear Analysis: Theory, Methods & Applications, Volume 77, January 2013, Pages 59–73
We study a class of nonlinear stochastic control problems with semicontinuous cost and state constraints using a linear programming (LP) approach. First, we provide a primal linearized problem stated on an appropriate space of probability measures with support contained in the set of constraints. This space is completely characterized by the coefficients of the control system. Second, we prove a semigroup property for this set of probability measures appearing in the definition of the primal value function. This leads to dynamic programming principles (DPPs) for control problems under state constraints with general (bounded) costs. A further linearized DPP is obtained for lower semicontinuous costs.
MSC: 93E20; 49L25
To the best of our knowledge, the constrained optimal control problem with continuous cost was first studied in earlier work in which the value function of an infinite-horizon control problem with space constraints was characterized as a continuous solution to the corresponding Hamilton–Jacobi–Bellman equation. For discontinuous cost functionals, the deterministic control problem with state constraints has been studied using viability theory tools.
The dynamic programming principle (DPP) is rather easily proven in the deterministic framework or in the discrete-time setting, whenever one deals with finite probability spaces. For regular cost functionals and general probability spaces, the stochastic DPP (without constraints) has been extensively studied. In general, DPPs are rather difficult to prove if the regularity of the value function is not known a priori: one must guarantee measurability properties beforehand and employ technical arguments on measurable selections. In discontinuous settings, an alternative to the classical method is to use a weak formulation. This method has recently been employed to provide a DPP for stochastic optimal control problems with expectation constraints. The idea is to replace the value function by a test function, which avoids measurability issues and is rather natural in the context of viscosity theory.
In both classical and weak DPPs, a key ingredient is the so-called “stability under concatenation” property (labelled A3 in the cited reference). Although trivially satisfied in several applications, this property does not hold for all classes of admissible controls. For example, in the case of piecewise deterministic Markov processes, one uses piecewise open-loop controls, which do not enjoy the concatenation property. Linearization techniques provide a way of avoiding the assumption of stability under concatenation.
These techniques are very similar to the diffusion setting on which we focus in the present paper.
The aim of the present paper is to provide linearized formulations for the general control problem with state constraints and to deduce linearized formulations of the dynamic programming principles in the discontinuous framework. Linear programming tools have been used efficiently to deal with stochastic control problems. An approach relying mainly on Hamilton–Jacobi(–Bellman) equations has been developed for deterministic control systems and then generalized to controlled Brownian diffusions (for infinite-horizon discounted control problems, and for Mayer and optimal stopping problems). The control processes and the associated solutions (and, possibly, stopping times) can be embedded into a space of probability measures satisfying a convenient condition, which is given in terms of the coefficient functions. Using Hamilton–Jacobi–Bellman techniques, one proves that minimizing continuous cost functionals with respect to the new set of constraints leads to the same value. This approach has the advantage of extending the control problems to a discontinuous setting. Moreover, these formulations turn out to provide the generalized solution of the (discontinuous) Hamilton–Jacobi–Bellman equation. Further details can be found in the cited literature.
We begin by recalling the main results in the unconstrained stochastic framework in Section 2.2. The results for continuous cost functions (see Theorem 1) make it possible to characterize the set of constraints defining the primal linearized formulation as the closed convex hull of occupational measures associated to controls.
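To make the "convenient condition" on probability measures more concrete, here is a sketch for the infinite-horizon discounted case. The notation is assumed for illustration and is not taken from the paper: $b$ and $\sigma$ are bounded Lipschitz coefficients, $U$ is a compact control set, the discount rate is normalized to 1, $g$ is a bounded cost, and $\gamma^{x,u}$, $\Theta(x)$ and $\Lambda(x)$ are hypothetical names. The paper's own definitions, in particular for Mayer-type problems, may differ.

```latex
% Sketch under assumed notation (not the paper's own statement).
\begin{align*}
  % Controlled diffusion and its generator
  dX^{x,u}_t &= b\big(X^{x,u}_t,u_t\big)\,dt + \sigma\big(X^{x,u}_t,u_t\big)\,dW_t,
  \qquad X^{x,u}_0 = x \in \mathbb{R}^N, \\
  \mathcal{L}^{v}\varphi(y) &= \tfrac12 \operatorname{Tr}\!\big[\sigma\sigma^{*}(y,v)\,D^{2}\varphi(y)\big]
  + \big\langle b(y,v),\, D\varphi(y)\big\rangle, \\
  % Discounted occupational measure of an admissible control u
  \gamma^{x,u}(A) &= \mathbb{E}\Big[\int_0^{\infty} e^{-t}\,\mathbf{1}_{A}\big(X^{x,u}_t,u_t\big)\,dt\Big],
  \qquad A \subset \mathbb{R}^N \times U \text{ Borel}, \\
  % Ito's formula applied to e^{-t}\varphi(X_t), for every \varphi in C_b^2
  0 &= \int_{\mathbb{R}^N\times U} \big(\mathcal{L}^{v}\varphi(y) - \varphi(y) + \varphi(x)\big)\,\gamma^{x,u}(dy\,dv), \\
  % Linear constraint set and linearized (primal) value
  \Theta(x) &= \Big\{\gamma \in \mathcal{P}\big(\mathbb{R}^N\times U\big) :
  \int \big(\mathcal{L}^{v}\varphi(y) - \varphi(y) + \varphi(x)\big)\,\gamma(dy\,dv) = 0
  \ \ \forall \varphi \in C_b^{2}(\mathbb{R}^N)\Big\}, \\
  \Lambda(x) &= \inf_{\gamma\in\Theta(x)} \int_{\mathbb{R}^N\times U} g(y,v)\,\gamma(dy\,dv).
\end{align*}
```

The identity in the fourth line is linear in the measure, so $\Theta(x)$ is convex and contains every occupational measure. This is the sense in which the condition is expressed through the coefficient functions alone.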
We briefly recall the linear formulations of the value in the case of lower and upper semicontinuous cost functionals and specify the connections between the classical value function and the primal and dual values in Theorem 4. These results are taken from earlier work.
In Section 2.3, we consider the case when the solution is constrained to some closed set K. It appears natural to modify the linearized value function by minimizing only with respect to probability measures whose support is included in K. Whenever the cost functions are bounded and lower semicontinuous, the linearized value function can also be obtained as the limit of (classical) penalized problems (cf. Theorem 11). We provide a dual formulation similar to the one in the unconstrained framework (Theorem 11). The dual formulation links the value function to the associated HJB equation on the set of constraints K. Furthermore, if standard convexity conditions hold true, the linearized value function coincides with the standard weak formulation under state constraints. This is consistent with the results in the unconstrained framework.
Using the characterization of the set of constraints in the linear formulation, we prove a semigroup property (Section 3.1). This property follows naturally from the structure of the sets of constraints. We derive dynamic programming principles under state constraints for general bounded cost functionals in Section 3.2. In the bounded case, an abstract DPP is given (cf. Theorem 18, assertion 1). In the lower semicontinuous setting, we provide a further linearized programming principle (Theorem 18, assertion 2). This is just a first step in studying the equations that would characterize the general value function. It suggests that the test functions in the definition of the viscosity solution might be defined on the space of probability measures.
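The introduction notes that the DPP is easy to establish in deterministic or discrete-time settings. The following minimal sketch illustrates that remark on a toy deterministic, discrete-time problem with a state constraint and a terminal (Mayer-type) cost. It is not the paper's stochastic linearized DPP. Every name in it (`K`, `CONTROLS`, `step`, `cost`, `brute_force`, `value_table`) and the chosen dynamics and cost are assumptions made for illustration.

```python
from itertools import product

# Toy data (assumed for illustration, not from the paper).
K = range(-2, 3)          # state constraint set: the state must stay in {-2,...,2}
CONTROLS = (-1, 0, 1)     # finite control set
T = 4                     # number of time steps
INF = float("inf")        # value assigned when no admissible control exists


def step(x, u):
    """Deterministic dynamics: move the state by the control."""
    return x + u


def cost(x):
    """Terminal cost; its unconstrained minimizer x = 3 lies outside K."""
    return (x - 3) ** 2


def brute_force(x0, horizon):
    """Minimize the terminal cost over all control sequences that keep the state in K."""
    best = INF
    for seq in product(CONTROLS, repeat=horizon):
        x, admissible = x0, x0 in K
        for u in seq:
            if not admissible:
                break
            x = step(x, u)
            admissible = x in K
        if admissible:
            best = min(best, cost(x))
    return best


def value_table(horizon, terminal):
    """Backward induction: table[k][x] is the constrained value with k steps left."""
    table = [{x: terminal(x) for x in K}]
    for _ in range(horizon):
        prev = table[-1]
        table.append({
            x: min((prev[step(x, u)] for u in CONTROLS if step(x, u) in K),
                   default=INF)
            for x in K
        })
    return table


V = value_table(T, cost)

# Dynamic programming agrees with exhaustive search over control sequences.
dp_matches_search = all(brute_force(x, T) == V[T][x] for x in K)

# Semigroup (DPP) property: solving over 2 steps with the 2-steps-to-go value
# as terminal cost reproduces the 4-step value.
semigroup_holds = value_table(2, lambda x: V[T - 2][x])[2] == V[T]
```

Here the constraint is enforced by discarding controls that leave `K`. This is the discrete counterpart of restricting attention to trajectories (or, in the linearized formulation, measures) supported in the constraint set.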