Exploiting locality of interactions using a policy-gradient approach in multiagent learning

Cited by: 1

Authors:

Melo, Francisco S. [1]

Affiliations:

[1] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA

Source:

ECAI 2008, PROCEEDINGS | 2008 / Vol. 178

Funding:

Andrew Mellon Foundation (USA);

Keywords:

DOI:

10.3233/978-1-58603-891-5-157

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we propose a policy-gradient reinforcement learning algorithm to address transition-independent Dec-POMDPs. This approach aims at implicitly exploiting the locality of interaction observed in many practical problems. Our algorithm can be described by an actor-critic architecture: the actor component combines natural gradient updates with a varying learning rate; the critic uses only local information to maintain a belief over the joint state space, and evaluates the current policy as a function of this belief using compatible function approximation. To speed up the convergence of the algorithm, we use an optimistic initialization of the policy that relies on a fully observable, single-agent model of the problem. We illustrate our approach in some simple application problems.
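
The abstract's pairing of natural gradient updates with compatible function approximation can be illustrated on a toy problem. The following is a minimal sketch, not the paper's algorithm: it assumes a single agent, a single state (a three-armed bandit with known deterministic rewards), a softmax policy, and expected rather than sampled critic updates. It has no beliefs, no partial observability, and no multiagent structure. The function names, the step sizes `beta` and `alpha0`, and the decay schedule are all hypothetical choices made for illustration. The point it shows is that, with compatible features (the gradient of the log-policy), the critic's least-squares weights `w` can serve directly as the natural gradient direction for the actor.

```python
import math


def softmax(theta):
    """Numerically stable softmax over a list of preferences."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


def natural_actor_critic(rewards, iterations=50, critic_steps=20,
                         beta=0.5, alpha0=1.0):
    """Toy natural actor-critic on a single-state problem (assumed setup).

    rewards[a] is the known, deterministic reward of action a.
    Returns the final action probabilities and the critic weights.
    """
    k = len(rewards)
    theta = [0.0] * k   # actor parameters (softmax preferences)
    w = [0.0] * k       # critic weights, warm-started across iterations
    for t in range(iterations):
        pi = softmax(theta)
        # Compatible features: psi[a][i] = d log pi(a) / d theta_i
        #                                = 1[a == i] - pi[i]
        psi = [[(1.0 if a == i else 0.0) - pi[i] for i in range(k)]
               for a in range(k)]
        # Critic: expected gradient steps on the least-squares objective
        #   sum_a pi[a] * (rewards[a] - w . psi[a])^2
        for _ in range(critic_steps):
            step = [0.0] * k
            for a in range(k):
                err = rewards[a] - sum(w[i] * psi[a][i] for i in range(k))
                for i in range(k):
                    step[i] += pi[a] * err * psi[a][i]
            w = [w[i] + beta * step[i] for i in range(k)]
        # Actor: move along w (the natural gradient estimate) with a
        # decaying, i.e. varying, learning rate (hypothetical schedule).
        alpha = alpha0 / (1.0 + 0.1 * t)
        theta = [theta[i] + alpha * w[i] for i in range(k)]
    return softmax(theta), w


probs, w = natural_actor_critic([1.0, 0.5, 0.0])
```

In this toy setting the policy concentrates on the highest-reward action, and the critic weights keep the same ordering as the rewards. Extending the sketch to the belief-based, local-information critic described in the abstract would require the model details given in the paper itself.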

Pages: 157 / +

Page count: 2

50 items in total

[41] An adaptative multiagent environment supported by the socio-constructivist approach using learning objects
Pinheiro, R
Furtado, E
Loureiro, R
[J]. Innovations Through Information Technology, Vols 1 and 2, 2004, : 1213 - 1215
[42] Optimal setpoint learning of a thruster-assisted position mooring system using a deep deterministic policy gradient approach
Yu, Shangyu
Wang, Lei
Li, Bo
He, Huacheng
[J]. Journal of Marine Science and Technology (Japan), 2020, 25 (03) : 757 - 768
[43] Optimal setpoint learning of a thruster-assisted position mooring system using a deep deterministic policy gradient approach
Yu, Shangyu
Wang, Lei
Li, Bo
He, Huacheng
[J]. JOURNAL OF MARINE SCIENCE AND TECHNOLOGY, 2020, 25 (03) : 757 - 768
[44] Optimal setpoint learning of a thruster-assisted position mooring system using a deep deterministic policy gradient approach
Shangyu Yu
Lei Wang
Bo Li
Huacheng He
[J]. Journal of Marine Science and Technology, 2020, 25 : 757 - 768
[45] Learning of Soccer Player Agents Using a Policy Gradient Method: Pass Selection
Igarashi, Harukazu
Fukuoka, Hitoshi
Ishihara, Seiji
[J]. INTERNATIONAL MULTICONFERENCE OF ENGINEERS AND COMPUTER SCIENTISTS (IMECS 2010), VOLS I-III, 2010, : 31 - +
[46] Continuous Parameter Control in Genetic Algorithms using Policy Gradient Reinforcement Learning
de Miguel Gomez, Alejandro
Toosi, Farshad Ghassemi
[J]. PROCEEDINGS OF THE 13TH INTERNATIONAL JOINT CONFERENCE ON COMPUTATIONAL INTELLIGENCE (IJCCI), 2021, : 115 - 122
[47] Learning a Self-driving Bicycle Using Deep Deterministic Policy Gradient
Le, Tuyen P.
Quang, Nguyen Dang
Choi, SeungYoon
Chung, TaeChoong
[J]. 2018 18TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS), 2018, : 231 - 236
[48] Data-Driven Optimal Bipartite Consensus Control for Second-Order Multiagent Systems via Policy Gradient Reinforcement Learning
Liu, Qiwei
Yan, Huaicheng
Wang, Meng
Li, Zhichen
Liu, Shuai
[J]. IEEE TRANSACTIONS ON CYBERNETICS, 2024, 54 (06) : 3468 - 3478
[49] Accelerating the Computation of Solutions in Resource Allocation Problems Using an Evolutionary Approach and Multiagent Reinforcement Learning
Bazzan, Ana L. C.
[J]. APPLICATIONS OF EVOLUTIONARY COMPUTATION, EVOAPPLICATIONS 2018, 2018, 10784 : 185 - 201
[50] Business Game-Based experimental Active Learning Using a Multiagent Approach for Management Education
Hishiyama, Reiko
Nakajima, Yuu
[J]. 3RD INTERNATIONAL CONFERENCE ON APPLIED COMPUTING AND INFORMATION TECHNOLOGY (ACIT 2015) 2ND INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND INTELLIGENCE (CSI 2015), 2015, : 254 - 259