A reinforcement learning approach to goal-regulation in a self-evolutionary manufacturing system
|Article code|Publication year|English article|Persian translation|Word count|
|3678|2012|8-page PDF|Order|6320 words|
Publisher: Elsevier - Science Direct
Journal: Expert Systems with Applications, Volume 39, Issue 10, August 2012, Pages 8736–8743
Current market dynamics have been forcing manufacturing systems to adapt quickly and continuously to an ever-changing environment. Self-evolution of manufacturing systems means a continuous process of adapting to the environment on the basis of autonomous goal-formation and goal-oriented dynamic organization. This paper proposes a goal-regulation mechanism, a principal working mechanism for autonomous goal-formation, that applies a reinforcement learning approach. Individual goals are regulated by a neural network-based fuzzy inference system, namely a goal-regulation network (GRN), which is updated by a reinforcement signal from another neural network called the goal-evaluation network (GEN). The GEN approximates the compatibility of goals with the current environmental situation. A production planning problem is also examined in a simulation study in order to validate the proposed goal-regulation mechanism.
Current market dynamics strongly demand flexibility and responsiveness from manufacturing systems. Manufacturing enterprises strive for continuous and quick adaptation to constantly varying customer requirements and a competitive environment. Conventional manufacturing systems, however, are not suited to this ever-changing environment because of their rigid organizational structure and predetermined static goals (Frayret et al., 2004; Heragu et al., 2002; Renna & Ambrico, 2011). To remain competitive, manufacturing systems need an advanced capability to organize their production resources dynamically and to formulate their goals autonomously. Furthermore, a manufacturing system is required to be capable of self-evolution according to its environmental situation (Shin, Mun, & Jung, 2009a).
In keeping with this line of thought, various multi-agent systems (MAS) have been proposed in the literature, including MetaMorph (Maturana, Shen, & Norrie, 1999), MetaMorph II (Shen, Maturana, & Norrie, 2000), PROSA (Van Brussel, Wyns, Valckenaers, Bongaerts, & Peeters, 1998), ADACOR (Leitão & Restivo, 2006), FrMS (Ryu, Yücesan, & Jung, 2006), and r-FrMS (Shin, Mun, Lee, & Jung, 2009b). Agent-based manufacturing systems fit naturally into a decentralized control structure, which gives them a flexible and reconfigurable organizational structure (Weiss, 1999). On this principle, all of these systems continuously adapt their organizational structure to the environment by means of self-organizing mechanisms directed toward achieving a goal. That is, goal-orientation is the main organizing rule. In the area of MAS, researchers have been interested in reinforcement learning approaches to the problem of how an agent learns to select proper actions for achieving its goals through interaction with its environment (Wang & Usher, 2007).
Several examples deal with dynamic order acceptance (Arredondo & Martinez, 2010), production control (Csáji, Monostori, & Kádár, 2006), production scheduling (Wang & Usher, 2004; Wang & Usher, 2007; Zhang et al., 2007), and agent architecture (Tan, Ong, & Tapanuj, 2011). All of these examples present successful approaches to goal-orientation under the assumption of well-defined goals. Every production resource represented as a resource agent, however, is required to regulate its own goal, not only to adapt to the changing environment but also to conform to its cooperators and competitors. An action oriented toward a fixed goal that no longer suits a changed environmental situation has an adverse effect on overall performance. Despite the various successful approaches to goal-orientation, the mechanism for regulating a predefined goal has not been fully explored. In a manufacturing system, an aiming level with respect to various criteria (e.g. returned profit, utilization of resources, and processing lead time) can be considered a goal. For example, if the aiming level of returned profit is too high, the resource agent tries to undertake only high-profit tasks and ignores relatively low-profit tasks. The agent then tends to miss the opportunity to undertake many tasks, since the high-profit tasks might be assigned to other agents that are more competitive (e.g. lower cost, shorter processing time, and higher quality). Consequently, not only the utilization rate but also the returned profit becomes low. In this case, the agent should lower its aiming level of returned profit. In other words, the resource agent should regulate its own goal so as to conform to its environmental status (involving the competitive or collaborative features of the entire environment and its interrelations with competitors).
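The returned-profit example above can be made concrete with a small toy calculation. The sketch below is only an illustration under stated assumptions: the function `simulate`, the task list, and the `taken_by_competitor` flag are hypothetical and do not come from the paper, and capacity limits are ignored.

```python
def simulate(aiming_level, tasks):
    """Return (total_profit, utilization) for one resource agent.

    tasks is a list of (profit, taken_by_competitor) pairs; the flag is a
    hypothetical stand-in for losing a task to a more competitive agent.
    """
    # The agent bids only on tasks that meet its aiming level and wins only
    # those that no stronger competitor has taken.
    won = [profit for profit, taken in tasks
           if profit >= aiming_level and not taken]
    return sum(won), len(won) / len(tasks)


# Illustrative numbers only: the two most profitable tasks go to competitors.
TASKS = [(10, True), (9, True), (8, False), (5, False), (4, False), (3, False)]

high_aim = simulate(8, TASKS)   # agent insists on high-profit tasks
low_aim = simulate(3, TASKS)    # agent lowers its aiming level
```

With these made-up numbers, the high aiming level yields a profit of 8 at one-sixth utilization, while the lower aiming level yields a profit of 20 at two-thirds utilization, which is the situation in which the agent should lower its goal.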
This paper proposes an autonomous goal-regulation mechanism that adopts a reinforcement learning approach, aiming at the implementation of a self-evolutionary manufacturing system. The proposed regulation mechanism facilitates adaptation of a predefined goal to the changing environment. Individual goals are dynamically changed by a neural network-based fuzzy inference system, and the neural network is updated by a reinforcement signal from the environment. The remainder of this paper is organized as follows. Section 2 reviews previous research on self-evolutionary manufacturing systems and discusses a reinforcement learning approach based on actor-critic learning. Section 3 is devoted to the details of the proposed regulation mechanism. In Section 4, a case study on production planning is presented. Finally, conclusions are drawn in Section 5.
Conclusion (English)
This paper has proposed a goal-regulation mechanism that adopts a reinforcement learning approach on the basis of actor-critic learning. The proposed mechanism is devoted to autonomous regulation of the predefined goal itself, whereas most reinforcement learning approaches focus on learning to select proper actions for achieving a predefined goal. Every decision entity, namely an autonomous and intelligent resource (AIR) unit, is equipped with two neural networks: a goal evaluation network (GEN) and a goal regulation network (GRN). The GEN is a critic estimator that approximates the compatibility of the current goal by using reinforcement signals from the environment. The GRN implements a fuzzy inference system that encapsulates the goal-regulation rules, and it is continuously updated by an internal reinforcement signal delivered from the GEN. The proposed mechanism was simulated in a production planning problem with exemplary fuzzy rules. As the simulation result shows, the proposed mechanism helps each AIR-unit act as a decision entity that conforms to its dynamic environment: the AIR-units are able to adapt their own goals to the changing environment continuously and autonomously. In the simulation study, the GA-based approach had a narrow lead over the proposed approach in terms of cost effectiveness. A centralized problem-solving approach such as the GA-based approach, however, is not suitable for current distributed and decentralized production environments, because in such a complex environment there cannot be a central decision entity authorized to access all the information needed to formulate and solve the problem. Distributed or decentralized approaches are therefore required. The proposed goal-regulation mechanism is a promising approach to achieving efficient collaboration among distributed and decentralized production resources.
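The interplay between a critic that evaluates the current goal and an actor that regulates it can be sketched in a few lines. The following is a minimal sketch under loud assumptions, not the paper's method: the GRN's fuzzy inference system is swapped for a plain linear actor with Gaussian exploration noise, the GEN for a linear TD(0) critic, and the features, learning rates, and `toy_env` (profit peaking at a moderate aiming level) are all hypothetical.

```python
import random

GAMMA = 0.9   # discount factor (assumed)
ALPHA = 0.1   # critic (GEN-like) learning rate (assumed)
BETA = 0.05   # actor (GRN-like) learning rate (assumed)
SIGMA = 0.1   # standard deviation of exploration noise (assumed)


def features(goal, utilization):
    """Hypothetical state features: bias, current goal, observed utilization."""
    return [1.0, goal, utilization]


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def td_error(reward, value, next_value, gamma=GAMMA):
    """Internal reinforcement signal: one-step temporal-difference error."""
    return reward + gamma * next_value - value


def toy_env(goal):
    """Hypothetical environment: higher aiming levels win fewer tasks."""
    utilization = max(0.0, 1.0 - goal)
    reward = goal * utilization  # profit per task times share of tasks won
    return reward, utilization


def step(goal, utilization, critic_w, actor_w, env, rng):
    """One actor-critic update; returns the new (goal, utilization)."""
    x = features(goal, utilization)
    noise = rng.gauss(0.0, SIGMA)
    # Actor proposes a goal adjustment; the noise provides exploration.
    new_goal = min(1.0, max(0.0, goal + dot(actor_w, x) + noise))
    reward, new_util = env(new_goal)
    delta = td_error(reward, dot(critic_w, x),
                     dot(critic_w, features(new_goal, new_util)))
    for i, xi in enumerate(x):
        critic_w[i] += ALPHA * delta * xi        # critic: TD(0) update
        actor_w[i] += BETA * delta * noise * xi  # actor: reinforce helpful noise
    return new_goal, new_util


rng = random.Random(0)
critic_w, actor_w = [0.0] * 3, [0.0] * 3
goal, util = 0.9, 0.1  # start with an aiming level that is too high
for _ in range(500):
    goal, util = step(goal, util, critic_w, actor_w, toy_env, rng)
```

The design choice mirrored here is that the actor never sees the environmental reward directly; it is updated only through the critic's temporal-difference error, which plays the role of the internal reinforcement signal delivered from the GEN to the GRN.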
In further research, we will endeavor to enable individual AIR-units to learn and become more reliable regulators. The current system assumes predefined regulating rules whose linguistic labels are deterministic; thus, AIR-units may take the same regulating actions in distinct environments. It is therefore necessary to tune the fuzzy logic controller according to the environment encountered. As another direction for further research, the fuzzy goal model and its constituent terms will be examined more extensively to explore the possibility of applying them to practical industrial situations, as well as the possibility of incorporating formulas that represent various objective terms of production resources into the fuzzy goal model.