Acquisition of coordinated behavior by modular Q-learning agents

被引:0
|
作者
Ono, N
Ikeda, O
Fukumoto, K
机构
关键词
D O I
暂无
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Recent attempts to let monolithic reinforcement-learning agents synthesize coordinated behavior scale poorly to more complicated multi-agent learning problems where multiple learning agents play different roles and work together for the accomplishment of their common goal. These learning agents have to receive and respond to various sensory information from their partners as well as that from the physical environment itself Hence, their state spaces are subject to grow exponentially in the number of the partners. As an illustrative problem suffered from this kind of combinatorial explosion, we consider a modified version of the pursuit problem, and show how successfully a collection of modular Q-learning hunter agents synthesize coordinated decision policies needed to capture a randomly-fleeing prey agent effectively, by specializing their functionality and acquiring herding behavior.
引用
收藏
页码:1525 / 1529
页数:5
相关论文
共 50 条
  • [21] Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning
    Ohnishi, Shota
    Uchibe, Eiji
    Yamaguchi, Yotaro
    Nakanishi, Kosuke
    Yasui, Yuji
    Ishii, Shin
    FRONTIERS IN NEUROROBOTICS, 2019, 13
  • [22] Q-learning with a growing RBF network for behavior learning in mobile robotics
    Li, J
    Duckett, T
    PROCEEDINGS OF THE SIXTH IASTED INTERNATIONAL CONFERENCE ON ROBOTICS AND APPLICATIONS, 2005, : 273 - 278
  • [23] Adaptive and Coordinated Traffic Signal Control Based on Q-Learning and MULTIBAND Model
    Lu, Shoufeng
    Liu, Ximin
    Dai, Shiqiang
    2008 IEEE CONFERENCE ON CYBERNETICS AND INTELLIGENT SYSTEMS, VOLS 1 AND 2, 2008, : 446 - +
  • [24] Learning rates for Q-Learning
    Even-Dar, E
    Mansour, Y
    COMPUTATIONAL LEARNING THEORY, PROCEEDINGS, 2001, 2111 : 589 - 604
  • [25] Learning rates for Q-learning
    Even-Dar, E
    Mansour, Y
    JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 5 : 1 - 25
  • [26] Time Horizon Generalization in Reinforcement Learning: Generalizing Multiple Q-Tables in Q-Learning Agents
    Hatcho, Yasuyo
    Hattori, Kiyohiko
    Takadama, Keiki
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2009, 13 (06) : 667 - 674
  • [27] Contextual Q-Learning
    Pinto, Tiago
    Vale, Zita
    ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, 325 : 2927 - 2928
  • [28] CVaR Q-Learning
    Stanko, Silvestr
    Macek, Karel
    COMPUTATIONAL INTELLIGENCE: 11th International Joint Conference, IJCCI 2019, Vienna, Austria, September 17-19, 2019, Revised Selected Papers, 2021, 922 : 333 - 358
  • [29] Bayesian Q-learning
    Dearden, R
    Friedman, N
    Russell, S
    FIFTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-98) AND TENTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICAL INTELLIGENCE (IAAI-98) - PROCEEDINGS, 1998, : 761 - 768
  • [30] Zap Q-Learning
    Devraj, Adithya M.
    Meyn, Sean P.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30