Graph Exploration for Effective Multiagent Q-Learning

Cited: 0
Authors
Zhaikhan, Ainur [1 ]
Sayed, Ali H. [1 ]
Affiliations
[1] Ecole Polytechnique Federale de Lausanne (EPFL), Adaptive Systems Laboratory, CH-1015 Lausanne, Switzerland
Keywords
Continuous state space; exploration; multiagent reinforcement learning (MARL); parallel Markov decision process (MDP)
DOI
10.1109/TNNLS.2024.3382480
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
This article proposes an exploration technique for multiagent reinforcement learning (MARL) with graph-based communication among agents. We assume that the individual rewards received by the agents are independent of the actions of the other agents, while their policies are coupled. In the proposed framework, neighboring agents collaborate to estimate the uncertainty about the state-action space in order to execute more efficient explorative behavior. In contrast to existing works, the proposed algorithm does not require counting mechanisms and can be applied to continuous-state environments without complex conversion techniques. Moreover, the proposed scheme allows agents to communicate in a fully decentralized manner with minimal information exchange; in continuous-state scenarios, each agent needs to exchange only a single parameter vector. The performance of the algorithm is verified with theoretical results for discrete-state scenarios and with experiments for continuous ones.
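The abstract is compact, so the sketch below makes its moving parts concrete. It is a hypothetical reconstruction, not the paper's method: the ring communication graph, the random-feature embedding, the fixed random target (a random-network-distillation-style, count-free novelty proxy), and the toy environment are all assumptions introduced for illustration. Only the structural points come from the abstract: each agent runs its own copy of the MDP, shares a single parameter vector with its graph neighbors, and uses the combined uncertainty estimate to drive exploration.

```python
# Minimal illustrative sketch (NOT the authors' algorithm): decentralized,
# count-free exploration for multiagent Q-learning over a communication graph.
# Everything named below is an assumption made for the sake of the example.
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS, DIM, N_ACTIONS = 4, 16, 3             # agents, feature dim., actions
GAMMA, ALPHA, MU, BETA = 0.95, 0.1, 0.05, 1.0   # discount, step sizes, bonus weight

# Ring communication graph with uniform, doubly stochastic combination weights.
A = np.zeros((N_AGENTS, N_AGENTS))
for k in range(N_AGENTS):
    for l in (k - 1, k, k + 1):
        A[k, l % N_AGENTS] = 1.0 / 3.0

W_feat = rng.normal(size=(DIM, 2))              # shared random feature projection

def phi(s, a):
    """Random-feature embedding of a continuous state and a discrete action."""
    z = np.cos(W_feat @ np.array([s, float(a)]))
    return z / np.linalg.norm(z)

u = rng.normal(size=DIM)                # fixed random target: count-free novelty proxy
w = np.zeros((N_AGENTS, DIM))           # ONE uncertainty parameter vector per agent
theta = np.zeros((N_AGENTS, N_ACTIONS, DIM))    # linear Q-weights per agent

def bonus(k, s_, a_):
    """Exploration bonus: squared error of agent k's predictor against the
    fixed random target; it shrinks where (s, a) has been visited often."""
    f = phi(s_, a_)
    return (u @ f - w[k] @ f) ** 2

def step_env(s_, a_):
    """Toy parallel 1-D MDP (placeholder dynamics): reward peaks at s = 0.5."""
    s_next = float(np.clip(s_ + 0.1 * (a_ - 1) + 0.01 * rng.normal(), 0.0, 1.0))
    return s_next, -abs(s_next - 0.5)

s = np.full(N_AGENTS, 0.1)              # each agent runs its own copy of the MDP
for t in range(500):
    w = A @ w                           # diffusion: combine neighbors' single vectors
    for k in range(N_AGENTS):
        q = np.array([theta[k, a] @ phi(s[k], a) for a in range(N_ACTIONS)])
        b = np.array([bonus(k, s[k], a) for a in range(N_ACTIONS)])
        a_sel = int(np.argmax(q + BETA * b))    # optimism via shared uncertainty
        s_next, r = step_env(s[k], a_sel)
        f = phi(s[k], a_sel)
        w[k] += MU * (u @ f - w[k] @ f) * f     # shrink local uncertainty estimate
        q_next = max(theta[k, a] @ phi(s_next, a) for a in range(N_ACTIONS))
        theta[k, a_sel] += ALPHA * (r + GAMMA * q_next - q[a_sel]) * f
        s[k] = s_next

print("final states per agent:", np.round(s, 2))
```

Note the communication cost in this sketch: the only quantity crossing agent boundaries is the DIM-dimensional vector w[k], which is consistent with the abstract's claim that each agent exchanges only a single parameter vector.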
Pages: 12
Related Papers
50 records in total
  • [1] Multiagent coordination utilising Q-learning
    Patnaik, Srikanta
    Mahalik, N. P.
    [J]. INTERNATIONAL JOURNAL OF AUTOMATION AND CONTROL, 2007, 1 (04) : 361 - 379
  • [2] Multiagent Q-learning based UAV trajectory planning for effective situational awareness
    Akin, Erdal
    Demir, Kubilay
    Yetgin, Halil
    [J]. TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2021, 29 (05) : 2561 - 2579
  • [3] A Multiagent approach to Q-learning for daily stock trading
    Lee, Jae Won
    Park, Jonghun
    O, Jangmin
    Lee, Jongwoo
    Hong, Euyseok
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS, 2007, 37 (06): : 864 - 877
  • [4] Multiagent Q-learning with Sub-Team Coordination
    Huang, Wenhan
    Li, Kai
    Shao, Kun
    Zhou, Tianze
    Taylor, Matthew E.
    Luo, Jun
    Wang, Dongge
    Mao, Hangyu
    Hao, Jianye
    Wang, Jun
    Deng, Xiaotie
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [5] SEM: Safe exploration mask for q-learning
    Xuan, Chengbin
    Zhang, Feng
    Lam, Hak-Keung
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2022, 111
  • [6] A Formal Model for Multiagent Q-Learning Dynamics on Regular Graphs
    Chu, Chen
    Li, Yong
    Liu, Jinzhuo
    Hu, Shuyue
    Li, Xuelong
    Wang, Zhen
    [J]. PROCEEDINGS OF THE THIRTY-FIRST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2022, 2022, : 194 - 200
  • [7] Training Multiagent Systems by Q-Learning: Approaches and Empirical Results
    Manuel Lopez-Guede, Jose
    Fernandez-Gauna, Borja
    Grana, Manuel
    Zulueta, Ekaitz
    [J]. COMPUTATIONAL INTELLIGENCE, 2015, 31 (03) : 498 - 512
  • [8] EAQR: A Multiagent Q-Learning Algorithm for Coordination of Multiple Agents
    Zhang, Zhen
    Wang, Dongqing
    [J]. COMPLEXITY, 2018,
  • [9] Recurrent Deep Multiagent Q-Learning for Autonomous Brokers in Smart Grid
    Yang, Yaodong
    Hao, Jianye
    Sun, Mingyang
    Wang, Zan
    Fan, Changjie
    Strbac, Goran
    [J]. PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 569 - 575
  • [10] Convergence of Multiagent Q-learning: Multi Action Replay Process Approach
    Kim, Han-Eol
    Ahn, Hyo-Sung
    [J]. 2010 IEEE INTERNATIONAL SYMPOSIUM ON INTELLIGENT CONTROL, 2010, : 789 - 794