Applying the policy gradient method to behavior learning in multiagent systems: The pursuit problem

Cited by: 0
Authors
Ishihara, Seiji [1]
Igarashi, Harukazu [1]
Affiliation
[1] School of Engineering, Kinki University, Higashi-Hiroshima, 739-2116, Japan
Source
Systems and Computers in Japan | 2006 / Vol. 37 / No. 10
Abstract
In the field of multiagent systems, some methods use the policy gradient method for behavior learning. In these methods, the learning problem in the multiagent system is reduced to each agent's independent learning problem by adopting an autonomous distributed behavior determination method. That is, a probabilistic policy that contains parameters is used as the policy of each agent, and the parameters are updated along the gradient so as to maximize the expected value of the reward. In this paper, we first treat the action determination problem at each time step as a minimization problem for some objective function, and adopt as the probabilistic policy the Boltzmann distribution in which this objective function is the energy function. Next, we show that this objective function can be expressed by such terms as the value of the state, the state-action rule, and the potential. Further, in an experiment applying this method to a pursuit problem, a good policy was obtained, and the method was found to be flexible: it can accommodate the use of heuristics and modifications of the behavioral constraints and objectives in the policy. © 2006 Wiley Periodicals, Inc.
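The Boltzmann policy and gradient update described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual energy function or update rule, so everything here is an assumption: the energy is tabular (one parameter per action), the update is a plain REINFORCE-style step, and the names `boltzmann_policy`, `grad_log_policy`, `reinforce_step`, `temperature`, and `step_size` are hypothetical.

```python
import math
import random


def boltzmann_policy(energies, temperature=1.0):
    """Return action probabilities proportional to exp(-E(a) / T)."""
    # Shift by the minimum energy so every exponent is <= 0 (numerical stability).
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / temperature) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


def grad_log_policy(energies, action, temperature=1.0):
    """Gradient of ln pi(action) with respect to each energy value E(b).

    Assumption: tabular energy, so the parameters are the energies themselves.
    Then d ln pi(a) / d E(b) = -(1[a == b] - pi(b)) / T.
    """
    probs = boltzmann_policy(energies, temperature)
    return [
        -((1.0 if b == action else 0.0) - probs[b]) / temperature
        for b in range(len(energies))
    ]


def reinforce_step(energies, reward_fn, rng, temperature=1.0, step_size=0.1):
    """Sample one action, observe its reward, and take one gradient-ascent step."""
    probs = boltzmann_policy(energies, temperature)
    action = rng.choices(range(len(energies)), weights=probs)[0]
    reward = reward_fn(action)
    grads = grad_log_policy(energies, action, temperature)
    # Ascend the expected reward: rewarded actions have their energy lowered.
    return [e + step_size * reward * g for e, g in zip(energies, grads)]
```

Calling `reinforce_step` repeatedly with a reward function that favors one action lowers that action's energy relative to the others, so the Boltzmann policy selects it more often. In the multiagent setting the abstract describes, each agent would hold its own parameters and run such an update independently.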
DOI
Not available
Document type
Journal article (JA)
Pages: 101-109