Graph Exploration for Effective Multiagent Q-Learning

Cited: 0
Authors
Zhaikhan, Ainur [1 ]
Sayed, Ali H. [1 ]
Affiliations
[1] Ecole Polytechnique Federale de Lausanne (EPFL), Adaptive Systems Laboratory, CH-1015 Lausanne, Switzerland
Keywords
Continuous state space; exploration; multiagent reinforcement learning (MARL); parallel Markov decision process (MDP)
DOI
10.1109/TNNLS.2024.3382480
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
This article proposes an exploration technique for multiagent reinforcement learning (MARL) with graph-based communication among agents. We assume that the individual rewards received by the agents are independent of the actions of the other agents, while their policies are coupled. In the proposed framework, neighboring agents collaborate to estimate the uncertainty about the state-action space in order to execute more efficient explorative behavior. In contrast to existing works, the proposed algorithm does not require counting mechanisms and can be applied to continuous-state environments without complex conversion techniques. Moreover, the proposed scheme allows agents to communicate in a fully decentralized manner with minimal information exchange; in continuous-state scenarios, each agent needs to exchange only a single parameter vector. The performance of the algorithm is verified with theoretical results for discrete-state scenarios and with experiments for continuous ones.
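The abstract is compact, so the sketch below makes its moving parts concrete. It is a hypothetical reconstruction, not the paper's method: the ring communication graph, the random-feature embedding, the fixed random target (a random-network-distillation-style, count-free novelty proxy), and the toy environment are all assumptions introduced for illustration. Only the structural points come from the abstract: each agent runs its own copy of the MDP, shares a single parameter vector with its graph neighbors, and uses the combined uncertainty estimate to drive exploration.

```python
# Minimal illustrative sketch (NOT the authors' algorithm): decentralized,
# count-free exploration for multiagent Q-learning over a communication graph.
# Everything named below is an assumption made for the sake of the example.
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS, DIM, N_ACTIONS = 4, 16, 3             # agents, feature dim., actions
GAMMA, ALPHA, MU, BETA = 0.95, 0.1, 0.05, 1.0   # discount, step sizes, bonus weight

# Ring communication graph with uniform, doubly stochastic combination weights.
A = np.zeros((N_AGENTS, N_AGENTS))
for k in range(N_AGENTS):
    for l in (k - 1, k, k + 1):
        A[k, l % N_AGENTS] = 1.0 / 3.0

W_feat = rng.normal(size=(DIM, 2))              # shared random feature projection

def phi(s, a):
    """Random-feature embedding of a continuous state and a discrete action."""
    z = np.cos(W_feat @ np.array([s, float(a)]))
    return z / np.linalg.norm(z)

u = rng.normal(size=DIM)                # fixed random target: count-free novelty proxy
w = np.zeros((N_AGENTS, DIM))           # ONE uncertainty parameter vector per agent
theta = np.zeros((N_AGENTS, N_ACTIONS, DIM))    # linear Q-weights per agent

def bonus(k, s_, a_):
    """Exploration bonus: squared error of agent k's predictor against the
    fixed random target; it shrinks where (s, a) has been visited often."""
    f = phi(s_, a_)
    return (u @ f - w[k] @ f) ** 2

def step_env(s_, a_):
    """Toy parallel 1-D MDP (placeholder dynamics): reward peaks at s = 0.5."""
    s_next = float(np.clip(s_ + 0.1 * (a_ - 1) + 0.01 * rng.normal(), 0.0, 1.0))
    return s_next, -abs(s_next - 0.5)

s = np.full(N_AGENTS, 0.1)              # each agent runs its own copy of the MDP
for t in range(500):
    w = A @ w                           # diffusion: combine neighbors' single vectors
    for k in range(N_AGENTS):
        q = np.array([theta[k, a] @ phi(s[k], a) for a in range(N_ACTIONS)])
        b = np.array([bonus(k, s[k], a) for a in range(N_ACTIONS)])
        a_sel = int(np.argmax(q + BETA * b))    # optimism via shared uncertainty
        s_next, r = step_env(s[k], a_sel)
        f = phi(s[k], a_sel)
        w[k] += MU * (u @ f - w[k] @ f) * f     # shrink local uncertainty estimate
        q_next = max(theta[k, a] @ phi(s_next, a) for a in range(N_ACTIONS))
        theta[k, a_sel] += ALPHA * (r + GAMMA * q_next - q[a_sel]) * f
        s[k] = s_next

print("final states per agent:", np.round(s, 2))
```

Note the communication cost in this sketch: the only quantity crossing agent boundaries is the DIM-dimensional vector w[k], which is consistent with the abstract's claim that each agent exchanges only a single parameter vector.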
Pages: 12
Related Papers
50 records in total
  • [1] Multiagent coordination utilising Q-learning
    Patnaik, Srikanta
    Mahalik, N. P.
    [J]. INTERNATIONAL JOURNAL OF AUTOMATION AND CONTROL, 2007, 1 (04) : 361 - 379
  • [2] Multiagent Q-learning based UAV trajectory planning for effective situational awareness
    Akin, Erdal
    Demir, Kubilay
    Yetgin, Halil
    [J]. TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2021, 29 (05) : 2561 - 2579
  • [3] A Multiagent approach to Q-learning for daily stock trading
    Lee, Jae Won
    Park, Jonghun
    O, Jangmin
    Lee, Jongwoo
    Hong, Euyseok
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS, 2007, 37 (06): : 864 - 877
  • [4] Multiagent Q-learning with Sub-Team Coordination
    Huang, Wenhan
    Li, Kai
    Shao, Kun
    Zhou, Tianze
    Taylor, Matthew E.
    Luo, Jun
    Wang, Dongge
    Mao, Hangyu
    Hao, Jianye
    Wang, Jun
    Deng, Xiaotie
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [5] SEM: Safe exploration mask for q-learning
    Xuan, Chengbin
    Zhang, Feng
    Lam, Hak-Keung
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2022, 111
  • [6] A Formal Model for Multiagent Q-Learning Dynamics on Regular Graphs
    Chu, Chen
    Li, Yong
    Liu, Jinzhuo
    Hu, Shuyue
    Li, Xuelong
    Wang, Zhen
    [J]. PROCEEDINGS OF THE THIRTY-FIRST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2022, 2022, : 194 - 200
  • [7] Training Multiagent Systems by Q-Learning: Approaches and Empirical Results
    Manuel Lopez-Guede, Jose
    Fernandez-Gauna, Borja
    Grana, Manuel
    Zulueta, Ekaitz
    [J]. COMPUTATIONAL INTELLIGENCE, 2015, 31 (03) : 498 - 512
  • [8] EAQR: A Multiagent Q-Learning Algorithm for Coordination of Multiple Agents
    Zhang, Zhen
    Wang, Dongqing
    [J]. COMPLEXITY, 2018,
  • [9] Recurrent Deep Multiagent Q-Learning for Autonomous Brokers in Smart Grid
    Yang, Yaodong
    Hao, Jianye
    Sun, Mingyang
    Wang, Zan
    Fan, Changjie
    Strbac, Goran
    [J]. PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 569 - 575
  • [10] Convergence of Multiagent Q-learning: Multi Action Replay Process Approach
    Kim, Han-Eol
    Ahn, Hyo-Sung
    [J]. 2010 IEEE INTERNATIONAL SYMPOSIUM ON INTELLIGENT CONTROL, 2010, : 789 - 794