Optimistic sequential multi-agent reinforcement learning with motivational communication

Authors
Huang, Anqi [1]
Wang, Yongli [1]
Zhou, Xiaoliang [1]
Zou, Haochen [1]
Dong, Xu [1]
Che, Xun [1]
Affiliations
[1] Nanjing University of Science and Technology, School of Computer Science and Engineering, Nanjing 210094, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Multi-agent reinforcement learning; Policy gradient; Motivational communication; Reinforcement learning; Multi-agent system
DOI
10.1016/j.neunet.2024.106547
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Centralized Training with Decentralized Execution (CTDE) is a prevalent paradigm in the field of fully cooperative Multi-Agent Reinforcement Learning (MARL). Existing algorithms often encounter two major problems: independent strategies tend to underestimate the potential value of actions, leading to convergence on sub-optimal Nash Equilibria (NE); and some communication paradigms add complexity to the learning process, making it harder to focus on the essential elements of the messages. To address these challenges, we propose a novel method called Optimistic Sequential Soft Actor Critic with Motivational Communication (OSSMC). The key idea of OSSMC is to utilize a greedy-driven approach to explore the potential value of individual policies, named optimistic Q-values, which serve as an upper bound for the Q-value of the current policy. We then integrate a sequential update mechanism with optimistic Q-values for agents, aiming to ensure monotonic improvement in the joint policy optimization process. Moreover, we establish motivational communication modules for each agent to disseminate motivational messages that promote cooperative behaviors. Finally, we employ a value regularization strategy from the Soft Actor Critic (SAC) method to maximize entropy and improve exploration capabilities. The performance of OSSMC was rigorously evaluated against a series of challenging benchmark sets. Empirical results demonstrate that OSSMC not only surpasses current baseline algorithms but also exhibits a more rapid convergence rate.
Pages: 12
Related papers
50 in total
  • [21] Cooperative Behavior by Multi-agent Reinforcement Learning with Abstractive Communication
    Tanda, Jin
    Moustafa, Ahmed
    Ito, Takayuki
    2019 IEEE INTERNATIONAL CONFERENCE ON AGENTS (ICA), 2019, : 8 - 13
  • [22] Multi-agent Pathfinding with Communication Reinforcement Learning and Deadlock Detection
    Ye, Zhaohui
    Li, Yanjie
    Guo, Ronghao
    Gao, Jianqi
    Fu, Wen
    INTELLIGENT ROBOTICS AND APPLICATIONS (ICIRA 2022), PT I, 2022, 13455 : 493 - 504
  • [23] Cooperative Multi-agent Reinforcement Learning with Hierachical Communication Architecture
    Liu, Shifan
    Yuan, Quan
    Chen, Bo
    Luo, Guiyang
    Li, Jinglin
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT II, 2022, 13530 : 14 - 25
  • [24] Deep Hierarchical Communication Graph in Multi-Agent Reinforcement Learning
    Liu, Zeyang
    Wan, Lipeng
    Sui, Xue
    Chen, Zhuoran
    Sun, Kewu
    Lan, Xuguang
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 208 - 216
  • [25] Semantic Communication for Partial Observation Multi-agent Reinforcement Learning
    Do, Hoang Khoi
    Dinh, Thi Quynh
    Nguyen, Minh Duong
    Nguyen, Tien Hoa
    2023 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP, SSP, 2023, : 319 - 323
  • [26] Communication-Efficient and Federated Multi-Agent Reinforcement Learning
    Krouka, Mounssif
    Elgabli, Anis
    Ben Issaid, Chaouki
    Bennis, Mehdi
    IEEE TRANSACTIONS ON COGNITIVE COMMUNICATIONS AND NETWORKING, 2022, 8 (01) : 311 - 320
  • [27] A sequential multi-agent reinforcement learning framework for different action spaces
    Tian, Shucong
    Yang, Meng
    Xiong, Rongling
    He, Xingxing
    Rajasegarar, Sutharshan
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 258
  • [28] DACOM: Learning Delay-Aware Communication for Multi-Agent Reinforcement Learning
    Yuan, Tingting
    Chung, Hwei-Ming
    Yuan, Jie
    Fu, Xiaoming
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 10, 2023, : 11763 - 11771
  • [29] Team-wise effective communication in multi-agent reinforcement learning
    Yang, Ming
    Zhao, Kaiyan
    Wang, Yiming
    Dong, Renzhi
    Du, Yali
    Liu, Furui
    Zhou, Mingliang
    Hou, U. Leong
    AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2024, 38 (02)
  • [30] Provably Efficient Multi-Agent Reinforcement Learning with Fully Decentralized Communication
    Lidard, Justin
    Madhushani, Udari
    Leonard, Naomi Ehrich
    2022 AMERICAN CONTROL CONFERENCE, ACC, 2022, : 3311 - 3316