Exploiting locality of interactions using a policy-gradient approach in multiagent learning

Cited by: 1
Authors
Melo, Francisco S. [1 ]
Affiliation
[1] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
Source
ECAI 2008, PROCEEDINGS | 2008, Vol. 178
Funding
Andrew W. Mellon Foundation (USA);
Keywords
DOI
10.3233/978-1-58603-891-5-157
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we propose a policy-gradient reinforcement learning algorithm to address transition-independent Dec-POMDPs. This approach aims at implicitly exploiting the locality of interaction observed in many practical problems. Our algorithm can be described by an actor-critic architecture: the actor component combines natural gradient updates with a varying learning rate; the critic uses only local information to maintain a belief over the joint state-space, and evaluates the current policy as a function of this belief using compatible function approximation. In order to speed up the convergence of the algorithm, we use an optimistic initialization of the policy that relies on a fully observable, single-agent model of the problem. We illustrate our approach on some simple application problems.
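The abstract describes an actor-critic scheme: the actor performs gradient updates on policy parameters while the critic estimates values and supplies the TD error that scales those updates. The following is a minimal, hypothetical sketch of that general pattern only; it is not Melo's transition-independent Dec-POMDP algorithm (it is single-agent, tabular, and uses a vanilla rather than natural gradient, with no belief state), and the toy two-state MDP is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 2, 2

def step(s, a):
    """Toy MDP: action 0 keeps the state, action 1 flips it;
    reward 1 only for taking action 1 in state 0."""
    r = 1.0 if (s == 0 and a == 1) else 0.0
    s2 = s if a == 0 else 1 - s
    return s2, r

def policy(theta, s):
    """Softmax (Gibbs) policy over action preferences theta[s]."""
    prefs = theta[s]
    p = np.exp(prefs - prefs.max())
    return p / p.sum()

theta = np.zeros((N_STATES, N_ACTIONS))  # actor parameters
w = np.zeros(N_STATES)                   # critic: tabular state values
alpha, beta, gamma = 0.1, 0.2, 0.9       # actor rate, critic rate, discount

s = 0
for _ in range(5000):
    p = policy(theta, s)
    a = rng.choice(N_ACTIONS, p=p)
    s2, r = step(s, a)
    delta = r + gamma * w[s2] - w[s]     # TD error from the critic
    w[s] += beta * delta                 # critic update
    grad = -p                            # d/d theta[s] of log pi(a|s)
    grad[a] += 1.0
    theta[s] += alpha * delta * grad     # actor update, scaled by TD error
    s = s2
```

After training, the learned policy strongly prefers the rewarded action in state 0 and the flip action in state 1 (which routes the agent back to the rewarding state). A natural-gradient actor, as in the paper, would instead precondition `grad` by the inverse Fisher information of the policy.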
Pages: 157+
Page count: 2