Proper orthogonal decomposition based optimal neurocontrol synthesis of a chemical reactor process using approximate dynamic programming
|Article code||Publication year||English article||Persian translation||Word count|
|24863||2003||10-page PDF||Available to order||6091 words|
Publisher: Elsevier - ScienceDirect
Journal : Neural Networks, Volume 16, Issues 5–6, June–July 2003, Pages 719–728
The concept of approximate dynamic programming and adaptive-critic neural-network-based optimal control is extended in this study to include systems governed by partial differential equations. An optimal controller is synthesized for a dispersion-type tubular chemical reactor, which is governed by two coupled nonlinear partial differential equations. The synthesis consists of three steps. First, empirical basis functions are designed using the ‘Proper Orthogonal Decomposition’ technique, and a low-order lumped parameter system that represents the infinite-dimensional system is obtained by carrying out a Galerkin projection. Second, the approximate dynamic programming technique is applied in a discrete-time framework, followed by the use of a dual neural network structure called adaptive critics, to obtain optimal neurocontrollers for this system. In this structure, one set of neural networks captures the relationship between the state variables and the control, whereas the other set captures the relationship between the state and the costate variables. Third, the lumped parameter control is mapped back to the spatial dimension using the same basis functions, yielding a feedback control. Numerical results are presented that illustrate the potential of this approach. The procedure presented in this study can be used to synthesize optimal controllers for a fairly general class of nonlinear distributed parameter systems.
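The first and third steps can be written compactly. The following is a sketch in assumed notation (the symbols $x$, $u$, $\phi_j$, $\hat{x}_j$, $\hat{u}_j$, $f$ and $N$ are chosen here for illustration, and a single state variable is shown for brevity, whereas the reactor has two coupled states):

```latex
% Expansion of the state and the control in N orthonormal POD basis functions
x(t,y) \approx \sum_{j=1}^{N} \hat{x}_j(t)\,\phi_j(y), \qquad
u(t,y) \approx \sum_{j=1}^{N} \hat{u}_j(t)\,\phi_j(y), \qquad
\langle \phi_i, \phi_j \rangle = \int \phi_i(y)\,\phi_j(y)\,dy = \delta_{ij}

% Galerkin projection of a PDE  \partial x / \partial t = f(x, x_y, x_{yy}, u)
% onto each basis function gives the lumped parameter (ODE) model
\frac{d\hat{x}_j}{dt} = \left\langle f\!\left(x, \tfrac{\partial x}{\partial y}, \tfrac{\partial^2 x}{\partial y^2}, u\right), \phi_j \right\rangle,
\qquad j = 1, \dots, N
```

Once the lumped control coefficients $\hat{u}_j(t)$ are available from the neurocontroller, the second expansion maps them back to a spatially distributed control.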
Process control problems are mostly governed by partial differential equations (PDEs) and are infinite-dimensional in nature. Such systems are also called Distributed Parameter Systems (DPS), as opposed to the ‘lumped parameter systems’ that characterize physical systems such as a car or an airplane. DPS appear naturally in various application areas such as chemical processes, thermal processes, vibrating structures, fluid flow systems, etc. They inherently have an infinite number of system modes. Since it is impossible to deal with all the modes, some sort of approximation technique is usually applied in the analysis and synthesis procedures related to DPS. Infinite-dimensional operator theory is used in (Curtain & Zwart, 1995) to find closed-form solutions for control. However, this approach is difficult to apply to nonlinear DPS. So far this technique has been mainly limited to linear systems (Curtain & Zwart, 1995) and spatially invariant systems (Bamieh, 1997). Moreover, the infinite-dimensional control solution needs to be approximated (for example, by truncating an infinite series) for implementation and hence is not completely free from error. This approach to control synthesis is usually known as ‘design-then-approximate’. Another approach is ‘approximate-then-design’. Here, the PDEs describing the system dynamics are first approximated to yield a finite-dimensional structure. This approximate system is then used for controller synthesis. In this approach, it is relatively easy to design controllers using various concepts. An interested reader can refer to (Burns & King, 1994) for discussions on the relative merits and limitations of the two approaches. A popular approximate-then-design technique is to find orthogonal basis functions for the PDE solutions and use them with a Galerkin procedure to first create an approximate finite-dimensional lumped parameter model, i.e. a system of Ordinary Differential Equations (ODEs).
This model is then used for control design using various tools of lumped parameter control design. If arbitrary basis functions (e.g. Fourier and Chebyshev polynomials) are used in the Galerkin procedure, they can result in a high-dimensional ODE system (Sadek & Bokhari, 1998). A better and more powerful basis function design is obtained when the Proper Orthogonal Decomposition (POD) technique is used with a Galerkin approximation. In the POD technique, a set of problem-oriented basis functions is first obtained by generating a set of ‘snap-shot solutions’ through simulations or from the actual process. Using these orthogonal basis functions in a Galerkin procedure, a low-dimensional ODE system can be developed. This technique has been widely used in recent years (Armaou and Christofides, 2002, Banks et al., 2000, Burns and King, 1994, Christofides, 2001, Holmes et al., 1996, Padhi and Balakrishnan, 2002, Ravindran, 1999 and Zheng et al., 2002). The issue of optimal control synthesis should be addressed next. It is well known that the dynamic programming formulation offers the most comprehensive solution to nonlinear optimal control; however, solving the associated Hamilton–Jacobi–Bellman (HJB) equation (Bryson & Ho, 1975) (also known as the Bellman equation) requires a huge amount of computation and storage. Werbos (1992) proposed a means to get around this numerical complexity by using ‘approximate dynamic programming’ (ADP) formulations. His methods approximate the original problem with a discrete formulation. The solution to the ADP formulation is obtained through the two-neural-network adaptive critic approach. In one version of the adaptive critic approach, called dual heuristic programming (DHP), one network, called the action network, represents the mapping between the state variables of a dynamic system and the control, and the second network, called the critic, outputs the costates with the state variables as its inputs.
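The interplay between the action and critic mappings can be illustrated on a toy problem. The sketch below is not the reactor problem and uses no neural networks: it assumes a hypothetical scalar linear system x[k+1] = a*x[k] + b*u[k] with stage cost 0.5*(q*x^2 + r*u^2), and it replaces the two networks by scalar gains (action: u = -k*x, critic: costate = p*x). All names and parameter values are illustrative assumptions. Each pass first updates the action from the optimal control equation using the current critic, then updates the critic from the costate equation.

```python
import math


def dhp_scalar_lq(a, b, q, r, iters=200):
    """Alternate action and critic updates for a scalar linear-quadratic toy problem.

    Hypothetical setup: x[k+1] = a*x[k] + b*u[k], stage cost 0.5*(q*x^2 + r*u^2).
    The critic is the linear map costate = p*x and the action is u = -k*x,
    standing in for the two neural networks of the adaptive critic structure.
    """
    p = 0.0  # critic gain (initial guess)
    k = 0.0  # action gain (initial guess)
    for _ in range(iters):
        # Optimal control equation: r*u + b*costate[k+1] = 0,
        # with costate[k+1] = p*(a*x + b*u). Solving for u gives u = -k*x.
        k = a * b * p / (r + b * b * p)
        # Costate equation: costate[k] = q*x + a*costate[k+1],
        # evaluated along the closed loop x[k+1] = (a - b*k)*x.
        p = q + a * p * (a - b * k)
    return p, k


# Example run with illustrative values.
p, k = dhp_scalar_lq(a=1.0, b=1.0, q=1.0, r=1.0)
print(f"critic gain p = {p:.6f}, action gain k = {k:.6f}")
```

For a = b = q = r = 1 the critic gain converges to the positive root of p^2 = p + 1, which is the solution of the corresponding discrete-time Riccati equation, so in this linear setting the alternating updates recover the known optimal feedback.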
This ADP process, through the nonlinear function approximation capabilities of neural networks, overcomes the computational complexity that plagued the dynamic programming formulation of optimal control problems. More importantly, this solution can be implemented on-line, since the control computation requires only a few multiplications with network weights that are trained off-line. This technique was used in Balakrishnan and Biega (1996) to solve an aircraft control problem in a domain of interest. Note that there are various types of adaptive critic designs available in the literature. An interested reader can refer to Prokhorov and Wunsch (1997) for more details. Recently, the techniques of POD and adaptive critic design have been combined by the authors of this study to develop an innovative computational tool for the optimal control synthesis of nonlinear DPS (Padhi & Balakrishnan, 2002). Besides good numerical results from simulations, this technique has been successfully verified in an experimental setting as well, with a relatively simple heat diffusion problem (Prabhat, Balakrishnan, Look, & Padhi, 2003). In this paper, this technique is applied to a more challenging nonlinear chemical reactor process. This dispersion-type tubular chemical reactor control problem has been discussed by Choe and Chang (1998) and Hlavacek and Hofmann (1970). Possible applications of such reactors include conversion of synthetic (or natural) gas into higher hydrocarbons, carbonization of biomass, synthesis of alkyl-tert-alkyl ester, hydro-formation of C2–C25 olefins, etc. Choe and Chang (1998) used a Green's function to calculate the optimal control. Their method for calculating the costate variables that arise in an optimal control formulation is complicated. Even though the authors found a Green's function for the particular problem, finding an appropriate Green's function and calculating its coefficients is not an easy task in general.
More importantly, their solution holds only for a specific initial condition (initial state profile). In other words, it is an open-loop control, and process performance would degrade severely if the initial profile were different. In contrast, the approach presented in this paper is applicable to a large number of initial conditions (or profiles). Once the neural networks are trained to capture the relationship between state and control within a domain of interest (which is done off-line), they can be used to compute the control for ‘any’ value of the state variables within that domain. Moreover, since evaluating a set of networks is not computationally intensive, the control can be implemented on-line. In control terminology this is a feedback solution; feedback control is desirable because of beneficial properties such as noise suppression and robustness with respect to modeling uncertainties. We wish to point out that we have solved the same chemical reactor optimal control problem earlier using a different approach (Padhi & Balakrishnan, 2001). In that approach, which did not use any model reduction technique, a controller was assigned to every node of the finite difference scheme (Gupta, 1995) used to discretize the spatial variable. Even though we obtained satisfactory results, there are some implementation issues. First, one has to take a large number of node points for a good finite difference approximation. However, because critic and action networks were proposed for each node point, the number of networks grows with the number of grid points, and this leads to serious problems in training the networks. As a consequence, one has to remain content with a ‘coarse grid approximation’. In contrast, the current approach is grid independent in the sense that the lumped parameter state vector does not depend on the number of grid points assumed for the integral evaluations.
Second, in the earlier technique the state (and control) values at points in space other than the node point locations are unknown. If one wants a value at such a location, interpolation techniques are necessary, and the prediction may not be good if the grid approximation is coarse. In contrast, in the proposed methodology the basis functions are, by definition, continuous functions, so values at any point in space can theoretically be computed without resorting to any interpolation technique. This issue is therefore of significantly less concern in our new approach: one can start from a fine-grid approximation, which results in much smaller interpolation errors. Third, in the earlier approach, the state variables used for training were generated with an ∞-norm based procedure. This poses a problem when the grid is refined, as the state profiles tend to become more and more discontinuous, which does not happen in practice. In contrast, in this paper we propose a method based on the L2 norm (of the state profiles as well as of their spatial derivative(s)), leading to the generation of ‘smooth’ state profiles for training. Besides capturing the expected state profiles more closely, this aids the neural network training process substantially, as unwanted (discontinuous) state profiles are not considered for training.
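The snapshot-based POD construction mentioned in the introduction can be sketched with a singular value decomposition. This is a minimal illustration under stated assumptions, not the procedure used in the paper: the snapshot profiles are hypothetical smooth functions (random combinations of three sine modes on a unit interval) standing in for reactor state profiles, the inner product is a simple uniform-grid approximation of the L2 inner product, and the function names `pod_basis`, `project` and `reconstruct` are invented for this example.

```python
import numpy as np


def pod_basis(snapshots, dy, energy=0.999):
    """Return POD modes (columns) orthonormal under <f, g> ~ dy * sum(f * g).

    snapshots: array of shape (n_grid, n_snapshots), one profile per column.
    energy: fraction of snapshot energy the retained modes must capture.
    """
    # Scaling by sqrt(dy) turns the weighted inner product into a plain dot product.
    u, s, _ = np.linalg.svd(snapshots * np.sqrt(dy), full_matrices=False)
    captured = np.cumsum(s**2) / np.sum(s**2)
    n_modes = min(int(np.searchsorted(captured, energy)) + 1, len(s))
    return u[:, :n_modes] / np.sqrt(dy)


def project(profile, basis, dy):
    """Lumped coefficients of a spatial profile in the POD basis."""
    return dy * basis.T @ profile


def reconstruct(coeffs, basis):
    """Map lumped coefficients back to a spatial profile."""
    return basis @ coeffs


# Hypothetical smooth snapshot set: random combinations of three sine modes.
rng = np.random.default_rng(0)
y = np.linspace(0.0, 1.0, 101)
dy = y[1] - y[0]
modes = np.stack([np.sin(m * np.pi * y) for m in (1, 2, 3)], axis=1)
snapshots = modes @ rng.standard_normal((3, 40))

phi = pod_basis(snapshots, dy)

# A new profile from the same family is captured by the low-order coordinates.
new_profile = 0.3 * np.sin(np.pi * y) - 0.7 * np.sin(3 * np.pi * y)
coeffs = project(new_profile, phi, dy)
error = np.max(np.abs(reconstruct(coeffs, phi) - new_profile))
print(f"modes kept: {phi.shape[1]}, max reconstruction error: {error:.2e}")
```

The lumped coefficient vector has as many entries as retained modes, regardless of how many grid points are used to evaluate the inner products, which mirrors the grid independence discussed above.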
Conclusion
Combining the techniques of POD and adaptive critic design, we have successfully synthesized an optimal controller for a nonlinear dispersion-type tubular reactor process. Simulation results are promising. The desired adiabatic steady-state profiles are reached quickly (in about 50% of the residence time). This increases the conversion efficiency of the reactor. More importantly, the controller is able to drive a large number of initial state profiles in the domain of interest towards the desired profiles. In this sense, the synthesized action neural network embeds the optimal control solution in a state feedback form, which is highly desirable in practical implementation. The technique presented in this paper can also be viewed as a general computational tool for the optimal control of nonlinear DPS. In other words, the procedure for synthesizing the networks remains the same; only the relevant state, costate and optimal control equations change depending on the problem under consideration.