Exploiting locality of interactions using a policy-gradient approach in multiagent learning

Cited by: 1
Authors
Melo, Francisco S. [1]
Affiliation
[1] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
Source
ECAI 2008, PROCEEDINGS | 2008, Vol. 178
Funding
Andrew W. Mellon Foundation (USA);
DOI
10.3233/978-1-58603-891-5-157
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we propose a policy gradient reinforcement learning algorithm to address transition-independent Dec-POMDPs. This approach aims at implicitly exploiting the locality of interaction observed in many practical problems. Our algorithms can be described by an actor-critic architecture: the actor component combines natural gradient updates with a varying learning rate; the critic uses only local information to maintain a belief over the joint state-space, and evaluates the current policy as a function of this belief using compatible function approximation. In order to speed the convergence of the algorithm, we use an optimistic initialization of the policy that relies on a fully observable, single agent model of the problem. We illustrate our approach in some simple application problems.
Pages: 157 / +
Page count: 2