An approach to the pursuit problem on a heterogeneous multiagent system using reinforcement learning

Cited by: 49
Authors
Ishiwaka, Y
Sato, T
Kakazu, Y
Affiliations
[1] Hakodate Natl Coll Technol, Hakodate, Hokkaido, Japan
[2] Future Univ Hakodate, Hakodate, Hokkaido, Japan
[3] Hokkaido Univ, Sapporo, Hokkaido, Japan
Keywords
pursuit problem; prediction; Q-learning; emergence; heterogeneous multiagent system
DOI
10.1016/S0921-8890(03)00040-X
CLC classification
TP [Automation and Computer Technology]
Discipline code
0812
Abstract
Cooperation among agents is important for multiagent systems that share a goal. In this paper, an instance of the pursuit problem is studied in which four hunters collaborate to catch a target. A reinforcement learning algorithm is employed to model how the hunters acquire the cooperative behavior needed to achieve the task. To apply Q-learning, a form of reinforcement learning, each hunter agent needs two kinds of prediction: the locations of the other hunter agents and the target agent, and the movement direction of the target agent at the next time step. In our treatment we extend the standard problem to systems with heterogeneous agents. One motivation for this is that the target agent and the hunter agents have differing abilities. In addition, even though the hunter agents are homogeneous at the start of the problem, their abilities become heterogeneous during learning. Simulations of this pursuit problem were performed in a continuous action-state space; the results are presented, accompanied by a discussion of how the outcomes depend on the initial locations of the hunters and on the speeds of the hunters and the target. (C) 2003 Elsevier Science B.V. All rights reserved.
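The abstract couples Q-learning with two predictions per hunter (the other agents' locations and the target's next move). The following Python code is a minimal, hypothetical sketch of that idea, assuming a tabular Q-learning agent over a discretized relative state and a simple frequency-based predictor of the target's next move; the paper itself works in a continuous action-state space, and all class, method, and parameter names here are illustrative, not taken from the paper.

```python
# Hypothetical sketch of one hunter in a pursuit task: tabular Q-learning on
# the hunter's position relative to the target, with the target's predicted
# next move folded into the state. Names and parameters are illustrative.
import random
from collections import defaultdict

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # up, down, right, left

class HunterAgent:
    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)          # Q[(state, action)] -> value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.move_counts = defaultdict(int)  # frequency predictor for target moves

    def predict_target_move(self):
        """Predict the target's next move as its most frequent past move."""
        if not self.move_counts:
            return (0, 0)
        return max(self.move_counts, key=self.move_counts.get)

    def state(self, hunter_pos, target_pos):
        # Relative offset to the target plus the predicted target move.
        dx = target_pos[0] - hunter_pos[0]
        dy = target_pos[1] - hunter_pos[1]
        return (dx, dy, self.predict_target_move())

    def choose_action(self, s):
        if random.random() < self.epsilon:                 # explore
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(s, a)])  # exploit

    def update(self, s, a, reward, s_next):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(s_next, a2)] for a2 in ACTIONS)
        self.q[(s, a)] += self.alpha * (reward + self.gamma * best_next - self.q[(s, a)])

    def observe_target(self, move):
        self.move_counts[move] += 1

# Toy usage: one hunter chases a randomly moving target on an unbounded grid.
agent = HunterAgent()
hunter, target = (0, 0), (5, 5)
for step in range(200):
    s = agent.state(hunter, target)
    a = agent.choose_action(s)
    hunter = (hunter[0] + a[0], hunter[1] + a[1])
    t_move = random.choice(ACTIONS)
    agent.observe_target(t_move)
    target = (target[0] + t_move[0], target[1] + t_move[1])
    reward = 1.0 if hunter == target else -0.01
    agent.update(s, a, reward, agent.state(hunter, target))
```

In the paper's full setting each of the four hunters would run such a learner concurrently, which is how their abilities can diverge and become heterogeneous during training.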
Pages: 245-256
Page count: 12
Related papers
50 records total
  • [21] An intelligent offloading system based on multiagent reinforcement learning
    Weng, Yu
    Chu, Haozhen
    Shi, Zhaoyi
SECURITY AND COMMUNICATION NETWORKS, 2021, 2021
  • [22] Multiagent reinforcement learning for a planetary exploration multirobot system
    Zhang Zheng
    Ma Shu-gen
    Cao Bing-gang
    Zhang Li-ping
    Li Bin
    AGENT COMPUTING AND MULTI-AGENT SYSTEMS, 2006, 4088 : 339 - 350
  • [23] A Multitier Reinforcement Learning Model for a Cooperative Multiagent System
    Shi, Haobin
    Zhai, Liangjing
    Wu, Haibo
    Hwang, Maxwell
    Hwang, Kao-Shing
    Hsu, Hsuan-Pei
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2020, 12 (03) : 636 - 644
  • [25] An Approach to Multi-Agent Pursuit Evasion Games Using Reinforcement Learning
    Bilgin, Ahmet Tunc
    Kadioglu-Urtis, Esra
PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS (ICAR), 2015 : 164 - 169
  • [26] Independent Deep Deterministic Policy Gradient Reinforcement Learning in Cooperative Multiagent Pursuit Games
    Zhou, Shiyang
    Ren, Weiya
    Ren, Xiaoguang
    Wang, Yanzhen
    Yi, Xiaodong
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2021, PT IV, 2021, 12894 : 625 - 637
  • [27] Dynamic Leader-Follower Output Containment Control of Heterogeneous Multiagent Systems Using Reinforcement Learning
    Zhang, Huaipin
    Zhao, Wei
    Xie, Xiangpeng
    Yue, Dong
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2024, 54 (09) : 5307 - 5316
  • [28] Coordination for Multienergy Microgrids Using Multiagent Reinforcement Learning
    Qiu, Dawei
    Chen, Tianyi
    Strbac, Goran
    Bu, Shengrong
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2023, 19 (04) : 5689 - 5700
  • [29] Applying the policy gradient method to behavior learning in multiagent systems: The pursuit problem
    Ishihara, Seiji
    Igarashi, Harukazu
SYSTEMS AND COMPUTERS IN JAPAN, 2006, 37 (10) : 101 - 109
  • [30] Cooperative learning behaviour strategy in heterogeneous multiagent system
    Zhuang, YB
    Yang, JG
    ICAI '05: PROCEEDINGS OF THE 2005 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOLS 1 AND 2, 2005, : 483 - 490