Distributional Reward Estimation for Effective Multi-Agent Deep Reinforcement Learning

Cited by: 0
Authors
Hu, Jifeng [1 ]
Sun, Yanchao [2 ]
Chen, Hechang [1 ]
Huang, Sili [1 ]
Piao, Haiyin [3 ]
Chang, Yi [1 ]
Sun, Lichao [4 ]
Affiliations
[1] Jilin Univ, Sch Artificial Intelligence, Changchun, Peoples R China
[2] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
[3] Northwestern Polytech Univ, Xian, Peoples R China
[4] Lehigh Univ, Bethlehem, PA USA
Funding
National Natural Science Foundation of China; National Key R&D Program of China
Keywords
DOI
None available
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Multi-agent reinforcement learning has drawn increasing attention in practice, e.g., robotics and autonomous driving, as it can explore optimal policies using samples generated by interacting with the environment. However, high reward uncertainty remains a problem when training a satisfactory model, because obtaining high-quality reward feedback is usually expensive or even infeasible. To handle this issue, previous methods mainly focus on passive reward correction, while recent active reward estimation methods have proven to be a recipe for reducing the effect of reward uncertainty. In this paper, we propose a novel Distributional Reward Estimation framework for effective Multi-Agent Reinforcement Learning (DRE-MARL). Our main idea is to design multi-action-branch reward estimation and policy-weighted reward aggregation for stabilized training. Specifically, the multi-action-branch reward estimation models reward distributions on all action branches, and reward aggregation is then used to obtain stable updating signals during training. Our intuition is that considering all possible consequences of actions can be useful for learning policies. The superiority of DRE-MARL is demonstrated on benchmark multi-agent scenarios against SOTA baselines in terms of both effectiveness and robustness.
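The abstract only names the two components, so the following minimal PyTorch sketch illustrates what multi-action-branch reward estimation with policy-weighted aggregation could look like. It is a hypothetical illustration, not the authors' implementation: the class and function names (MultiActionBranchRewardEstimator, policy_weighted_aggregation), the Gaussian per-branch parameterization, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn

class MultiActionBranchRewardEstimator(nn.Module):
    """Hypothetical sketch: predict a Gaussian reward distribution
    (mean, log-std) for every discrete action branch, given an observation."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.mean_head = nn.Linear(hidden, n_actions)     # per-branch reward mean
        self.log_std_head = nn.Linear(hidden, n_actions)  # per-branch reward spread

    def forward(self, obs: torch.Tensor):
        h = self.trunk(obs)
        return self.mean_head(h), self.log_std_head(h)

def policy_weighted_aggregation(reward_means: torch.Tensor,
                                policy_probs: torch.Tensor) -> torch.Tensor:
    """Aggregate per-branch estimates under the current policy:
    r_agg(s) = sum_a pi(a|s) * r_hat(s, a), a smoother training signal
    than the single sampled environment reward."""
    return (policy_probs * reward_means).sum(dim=-1)

# Toy usage with made-up dimensions.
obs = torch.randn(32, 10)                          # batch of 32 observations
estimator = MultiActionBranchRewardEstimator(obs_dim=10, n_actions=4)
means, log_stds = estimator(obs)
probs = torch.softmax(torch.randn(32, 4), dim=-1)  # stand-in for pi(.|s)
r_agg = policy_weighted_aggregation(means, probs)  # shape: (32,)
```

One plausible way this realizes the "stable updating signals" described above is to fit the estimator to observed rewards (e.g., via Gaussian negative log-likelihood on the taken-action branch) and feed r_agg, rather than the raw noisy reward, into each agent's critic target; this is inference from the abstract, not a confirmed detail of the paper.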
Pages: 14
Related Papers
50 records in total
  • [31] Multi-Agent Deep Reinforcement Learning with Emergent Communication
    Simoes, David
    Lau, Nuno
    Reis, Luis Paulo
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [32] Experience Selection in Multi-Agent Deep Reinforcement Learning
    Wang, Yishen
    Zhang, Zongzhang
    2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 864 - 870
  • [33] Sparse communication in multi-agent deep reinforcement learning
    Han, Shuai
    Dastani, Mehdi
    Wang, Shihan
    NEUROCOMPUTING, 2025, 625
  • [34] Multi-Agent Deep Reinforcement Learning with Human Strategies
    Thanh Nguyen
    Ngoc Duy Nguyen
    Nahavandi, Saeid
    2019 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT), 2019, : 1357 - 1362
  • [35] Competitive Evolution Multi-Agent Deep Reinforcement Learning
    Zhou, Wenhong
    Chen, Yiting
    Li, Jie
    PROCEEDINGS OF THE THIRD INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND APPLICATION ENGINEERING (CSAE2019), 2019,
  • [36] Strategic Interaction Multi-Agent Deep Reinforcement Learning
    Zhou, Wenhong
    Li, Jie
    Chen, Yiting
    Shen, Lin-Cheng
    IEEE ACCESS, 2020, 8 : 119000 - 119009
  • [37] Cooperative Exploration for Multi-Agent Deep Reinforcement Learning
    Liu, Iou-Jen
    Jain, Unnat
    Yeh, Raymond A.
    Schwing, Alexander G.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [38] A review of cooperative multi-agent deep reinforcement learning
    Oroojlooy, Afshin
    Hajinezhad, Davood
    APPLIED INTELLIGENCE, 2023, 53 (11) : 13677 - 13722
  • [39] Multi-Agent Deep Reinforcement Learning for Walker Systems
    Park, Inhee
    Moh, Teng-Sheng
    20TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2021), 2021, : 490 - 495
  • [40] Action Markets in Deep Multi-Agent Reinforcement Learning
    Schmid, Kyrill
    Belzner, Lenz
    Gabor, Thomas
    Phan, Thomy
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2018, PT II, 2018, 11140 : 240 - 249