Emergent Social Learning via Multi-agent Reinforcement Learning

Cited by: 0
Authors
Ndousse, Kamal [1]
Eck, Douglas [2]
Levine, Sergey [2,3]
Jaques, Natasha [2,3]
Affiliations
[1] OpenAI, San Francisco, CA 94110, USA
[2] Google Research, Brain Team, Mountain View, CA, USA
[3] University of California, Berkeley, Berkeley, CA, USA
Keywords
None listed
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Social learning is a key component of human and animal intelligence. By taking cues from the behavior of experts in their environment, social learners can acquire sophisticated behavior and rapidly adapt to new circumstances. This paper investigates whether independent reinforcement learning (RL) agents in a multi-agent environment can learn to use social learning to improve their performance. We find that in most circumstances, vanilla model-free RL agents do not use social learning. We analyze the reasons for this deficiency, and show that by imposing constraints on the training environment and introducing a model-based auxiliary loss, we are able to obtain generalized social learning policies which enable agents to: i) discover complex skills that are not learned from single-agent training, and ii) adapt online to novel environments by taking cues from experts present in the new environment. In contrast, agents trained with model-free RL or imitation learning generalize poorly and do not succeed in the transfer tasks. By mixing multi-agent and solo training, we can obtain agents that use social learning to gain skills that they can deploy when alone, even outperforming agents trained alone from the start.
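To make the abstract's "model-based auxiliary loss" concrete, the sketch below attaches a next-observation prediction head to a standard advantage actor-critic agent in PyTorch. This is a minimal illustration under stated assumptions, not the authors' released implementation: the class and parameter names (SocialLearningAgent, aux_coef, dynamics_head) and the choice of next-observation prediction as the auxiliary target are assumptions for exposition.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SocialLearningAgent(nn.Module):
    """Actor-critic agent with a model-based auxiliary loss (illustrative)."""

    def __init__(self, obs_dim, n_actions, hidden_dim=128, aux_coef=0.1):
        super().__init__()
        self.n_actions = n_actions
        self.aux_coef = aux_coef
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden_dim), nn.ReLU())
        self.policy_head = nn.Linear(hidden_dim, n_actions)  # actor
        self.value_head = nn.Linear(hidden_dim, 1)           # critic
        # Auxiliary world model: predict the next observation from the
        # current latent state and the chosen action.
        self.dynamics_head = nn.Linear(hidden_dim + n_actions, obs_dim)

    def loss(self, obs, actions, returns, next_obs):
        z = self.encoder(obs)
        logits = self.policy_head(z)
        values = self.value_head(z).squeeze(-1)
        # Standard advantage actor-critic terms.
        advantages = (returns - values).detach()
        log_probs = F.log_softmax(logits, dim=-1)
        chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
        pg_loss = -(chosen * advantages).mean()
        value_loss = F.mse_loss(values, returns)
        # Model-based auxiliary term: since other agents appear in each
        # agent's observations, predicting the next observation pressures
        # the encoder to represent experts' behavior.
        a_onehot = F.one_hot(actions, num_classes=self.n_actions).float()
        pred_next = self.dynamics_head(torch.cat([z, a_onehot], dim=-1))
        aux_loss = F.mse_loss(pred_next, next_obs)
        return pg_loss + 0.5 * value_loss + self.aux_coef * aux_loss

A quick smoke test: agent = SocialLearningAgent(obs_dim=16, n_actions=5), then agent.loss(torch.randn(32, 16), torch.randint(0, 5, (32,)), torch.randn(32), torch.randn(32, 16)).backward(). The key design point is that the auxiliary gradient flows into the shared encoder, which is the mechanism the abstract credits for enabling cue-taking from experts and transfer to novel environments.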
Pages: 14
Related Papers
50 records in total (items [31]-[40] shown below)
  • [31] Sheng, Junjie; Wang, Xiangfeng; Jin, Bo; Yan, Junchi; Li, Wenhao; Chang, Tsung-Hui; Wang, Jun; Zha, Hongyuan. Learning structured communication for multi-agent reinforcement learning. Autonomous Agents and Multi-Agent Systems, 2022, 36(2).
  • [32] De Hauwere, Yann-Michael; Vrancx, Peter; Nowe, Ann. Generalized learning automata for multi-agent reinforcement learning. AI Communications, 2010, 23(4): 311-324.
  • [33] Wang, Huimu; Qiu, Tenghai; Liu, Zhen; Pu, Zhiqiang; Yi, Jianqiang; Yuan, Wanmai. Multi-Agent Cognition Difference Reinforcement Learning for Multi-Agent Cooperation. 2021 International Joint Conference on Neural Networks (IJCNN), 2021.
  • [34] Chen, Hao; Yang, Guangkai; Zhang, Junge; Yin, Qiyue; Huang, Kaiqi. Multi-Agent Uncertainty Sharing for Cooperative Multi-Agent Reinforcement Learning. 2022 International Joint Conference on Neural Networks (IJCNN), 2022.
  • [35] Xu, Chi; Zhang, Hui; Zhang, Ya. Multi-Agent Reinforcement Learning With Distributed Targeted Multi-Agent Communication. 2023 35th Chinese Control and Decision Conference (CCDC), 2023: 2915-2920.
  • [36] Li, Yang; Luo, Xiangfeng; Xie, Shaorong. Learning Heterogeneous Strategies via Graph-based Multi-agent Reinforcement Learning. 2021 IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI 2021), 2021: 709-713.
  • [37] Li, Yang; Wang, Xinzhi; Wang, Wei; Zhang, Zhenyu; Wang, Jianshu; Luo, Xiangfeng; Xie, Shaorong. Learning adversarial policy in multiple scenes environment via multi-agent reinforcement learning. Connection Science, 2021, 33(3): 407-426.
  • [38] Cao, Shaohua; Zhang, Hanqing; Wen, Tian; Zhao, Hongwei; Zheng, Quancheng; Zhang, Weishan; Zheng, Danyang. FedQMIX: Communication-efficient federated learning via multi-agent reinforcement learning. High-Confidence Computing, 2024, 4(2).
  • [39] Martinez-Gil, Francisco; Lozano, Miguel; Fernandez, Fernando. Emergent Collective Behaviors in a Multi-agent Reinforcement Learning Pedestrian Simulation: A Case Study. Multi-Agent-Based Simulation XV, 2015, 9002: 228-238.
  • [40] Martinez-Gil, Francisco; Lozano, Miguel; Fernandez, Fernando. Emergent behaviors and scalability for multi-agent reinforcement learning-based pedestrian models. Simulation Modelling Practice and Theory, 2017, 74: 117-133.