Exploiting locality of interactions using a policy-gradient approach in multiagent learning

Cited by: 1
Authors
Melo, Francisco S. [1 ]
Affiliation
[1] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
Source
ECAI 2008, PROCEEDINGS | 2008, Vol. 178
Funding
Andrew W. Mellon Foundation (USA);
Keywords
DOI
10.3233/978-1-58603-891-5-157
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we propose a policy-gradient reinforcement learning algorithm to address transition-independent Dec-POMDPs. This approach aims at implicitly exploiting the locality of interaction observed in many practical problems. Our algorithm can be described by an actor-critic architecture: the actor component combines natural gradient updates with a varying learning rate; the critic uses only local information to maintain a belief over the joint state-space, and evaluates the current policy as a function of this belief using compatible function approximation. In order to speed up the convergence of the algorithm, we use an optimistic initialization of the policy that relies on a fully observable, single-agent model of the problem. We illustrate our approach on some simple application problems.
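The abstract describes an actor-critic scheme: the actor performs gradient updates on policy parameters while the critic estimates values and supplies the TD error that scales those updates. The following is a minimal, hypothetical sketch of that general pattern only; it is not Melo's transition-independent Dec-POMDP algorithm (it is single-agent, tabular, and uses a vanilla rather than natural gradient, with no belief state), and the toy two-state MDP is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 2, 2

def step(s, a):
    """Toy MDP: action 0 keeps the state, action 1 flips it;
    reward 1 only for taking action 1 in state 0."""
    r = 1.0 if (s == 0 and a == 1) else 0.0
    s2 = s if a == 0 else 1 - s
    return s2, r

def policy(theta, s):
    """Softmax (Gibbs) policy over action preferences theta[s]."""
    prefs = theta[s]
    p = np.exp(prefs - prefs.max())
    return p / p.sum()

theta = np.zeros((N_STATES, N_ACTIONS))  # actor parameters
w = np.zeros(N_STATES)                   # critic: tabular state values
alpha, beta, gamma = 0.1, 0.2, 0.9       # actor rate, critic rate, discount

s = 0
for _ in range(5000):
    p = policy(theta, s)
    a = rng.choice(N_ACTIONS, p=p)
    s2, r = step(s, a)
    delta = r + gamma * w[s2] - w[s]     # TD error from the critic
    w[s] += beta * delta                 # critic update
    grad = -p                            # d/d theta[s] of log pi(a|s)
    grad[a] += 1.0
    theta[s] += alpha * delta * grad     # actor update, scaled by TD error
    s = s2
```

After training, the learned policy strongly prefers the rewarded action in state 0 and the flip action in state 1 (which routes the agent back to the rewarding state). A natural-gradient actor, as in the paper, would instead precondition `grad` by the inverse Fisher information of the policy.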
Pages: 157+
Page count: 2