Multi-goal Q-learning of cooperative teams
Publisher : Elsevier - Science Direct
Journal : Expert Systems with Applications, Volume 38, Issue 3, March 2011, Pages 1565–1574
This paper studies a multi-goal Q-learning algorithm for cooperative teams. Each member of a cooperative team is simulated by an agent. In the virtual cooperative team, agents adapt their knowledge according to cooperative principles. The multi-goal Q-learning algorithm is designed to handle multiple learning goals. In the virtual team, agents learn what knowledge to adopt and how much to learn (by choosing a learning radius). The learning radius is explained in Section 3.1. Five basic experiments are conducted to demonstrate the validity of the multi-goal Q-learning algorithm. The results show that the learning algorithm causes agents to converge to optimal actions, based on the agents’ continually updated cognitive maps of how actions influence the learning goals. It is also shown that the learning algorithm benefits the multiple goals. Furthermore, the paper analyzes how sensitive the learning performance is to the parameter values of the learning algorithm.
In cooperative teams, team members adopt knowledge to improve their own ability and the team’s performance. They have more than one learning goal. In this paper, the team members’ learning goals consist of the size of the team, the performance of the team, and the performance of individuals. A multi-agent system is used to simulate cooperative teams. The model of the virtual cooperative team is based on the research of Gilbert and Ahrweiler (Ahrweiler et al., 2004; Gilbert et al., 2001, 2007). In their research, the “KENE” was used to describe the knowledge of members, and suppliers and customers were generated through computation on the KENE. Building on that research, this paper proposes a virtual cooperative team as the experimental framework for the learning algorithm. Details of the virtual cooperative team are given in Section 3.1.
To address the multi-goal problem, Gadanho (2003) presented the ALEC agent architecture, which has both emotive and cognitive decision-making capabilities, to adapt to a multi-goal survival task. Gadanho’s research is useful for dealing with multi-goal tasks in which the goals may conflict with each other. An improved reinforcement learning algorithm was proposed to learn multi-goal dialogue strategies (Cuayáhuitl, 2006). Zhou and Coggins (2004) presented an emotion-based hierarchical reinforcement learning (HRL) algorithm for environments with multiple reward goals.
In this paper, the multi-goal Q-learning algorithm is proposed to improve the multi-goal learning ability of the agents (the virtual team members). The learning algorithm also addresses the agents’ tendency to explore unknown actions. Agents using the learning algorithm can decide by themselves what knowledge to adopt and how much to learn (by choosing a learning radius) in pursuit of multiple goals. Experimental results show that agents using the learning algorithm can achieve the multiple goals. Moreover, two sets of sensitivity experiments are conducted in the paper.
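As a rough illustration of how an agent might learn both what knowledge to adopt and which learning radius to use, the sketch below applies standard tabular Q-learning with an epsilon-greedy exploration rule and combines the goals by a weighted sum. None of it is taken from the paper: the class name `MultiGoalQAgent`, the weighted-sum scalarization, the parameter values, the single state `"s0"`, and the toy reward function are all assumptions made for illustration, and the paper's own algorithm may differ on every one of these points.

```python
import random
from itertools import product


class MultiGoalQAgent:
    """Tabular Q-learner whose reward is a weighted sum of per-goal rewards.

    Hypothetical sketch: not the algorithm from the paper.
    """

    def __init__(self, knowledge_items, radii, goal_weights,
                 alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
        # Each action pairs a knowledge item to adopt with a learning radius.
        self.actions = list(product(knowledge_items, radii))
        self.goal_weights = goal_weights
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # probability of exploring a random action
        self.rng = random.Random(seed)
        self.q = {}             # maps (state, action) -> estimated value

    def value(self, state, action):
        # Unseen state-action pairs default to 0.
        return self.q.get((state, action), 0.0)

    def choose(self, state):
        # Epsilon-greedy: sometimes try a random action, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.value(state, a))

    def scalarize(self, goal_rewards):
        # Assumption: goals are combined by linear scalarization.
        return sum(w * r for w, r in zip(self.goal_weights, goal_rewards))

    def update(self, state, action, goal_rewards, next_state):
        # Standard Q-learning update applied to the scalarized reward.
        reward = self.scalarize(goal_rewards)
        best_next = max(self.value(next_state, a) for a in self.actions)
        old = self.value(state, action)
        target = reward + self.gamma * best_next
        self.q[(state, action)] = old + self.alpha * (target - old)


def toy_goal_rewards(action):
    # Made-up payoffs for three goals: team size, team performance,
    # and individual performance. Item "b" with radius 2 is best.
    item, radius = action
    team_size = 1.0 if radius == 2 else 0.0
    team_performance = 1.0 if item == "b" else 0.0
    individual_performance = 0.5
    return (team_size, team_performance, individual_performance)


agent = MultiGoalQAgent(["a", "b"], [1, 2, 3], goal_weights=(1.0, 1.0, 1.0))
for _ in range(5000):
    action = agent.choose("s0")
    agent.update("s0", action, toy_goal_rewards(action), "s0")

# Greedy action after training: the (knowledge item, radius) pair
# with the highest learned value in the single toy state.
best = max(agent.actions, key=lambda a: agent.value("s0", a))
print(best)
```

In this toy setting the weights in `goal_weights` control how the agent trades one goal against another; changing them, or the values of `alpha` and `epsilon`, is one simple way to experiment with how sensitive the learned behaviour is to the parameters.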
Conclusion
To improve the agents’ multi-goal learning ability, this paper presents a multi-goal Q-learning algorithm for cooperative teams. First, the paper proposes a multi-agent cooperative team as the setting for experiments with the learning algorithm. Second, the multi-goal Q-learning algorithm is modeled to solve the multi-goal learning problem in the virtual team. The algorithm enables each agent to adjust its learning radius based on its own experience. With the knowledge learned from others, the agents are able to achieve the multiple goals. The five experiments illustrate that the multi-goal Q-learning algorithm can be used as an effective learning approach in cooperative teams. Furthermore, sensitivity analyses of two important learning parameters are conducted in Section 5. The work presented here can be extended in several directions. For example, more complex learning mechanisms could be used to improve the agents’ learning ability so that they can handle more complex problems. Such extensions of learning ability may provide better mechanisms for resolving conflicts among cooperative team members. These are key issues for research on cooperative teams.