Acquisition of coordinated behavior by modular Q-learning agents

Cited by: 0

Authors:

Ono, N

Ikeda, O

Fukumoto, K

Affiliations:

Source:

IROS 96 - PROCEEDINGS OF THE 1996 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS - ROBOTIC INTELLIGENCE INTERACTING WITH DYNAMIC WORLDS, VOLS 1-3 | 1996

Keywords:

DOI:

Not available

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Recent attempts to let monolithic reinforcement-learning agents synthesize coordinated behavior scale poorly to more complicated multi-agent learning problems, in which multiple learning agents play different roles and work together to accomplish a common goal. These learning agents have to receive and respond to sensory information from their partners as well as from the physical environment itself. Hence, their state spaces tend to grow exponentially in the number of partners. As an illustrative problem that suffers from this kind of combinatorial explosion, we consider a modified version of the pursuit problem, and show how successfully a collection of modular Q-learning hunter agents can synthesize the coordinated decision policies needed to capture a randomly fleeing prey agent effectively, by specializing their functionality and acquiring herding behavior.
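
The growth described in the abstract can be illustrated with a rough state count. This is only a sketch under assumptions that the abstract does not state: each observed agent's relative position takes one of `positions` discrete values (25 here, a hypothetical 5x5 sensing window), a monolithic hunter keeps one table over the prey and all partners, and each module of a modular hunter observes the prey plus a single partner. The function names are illustrative, not taken from the paper.

```python
def monolithic_states(positions: int, observed_agents: int) -> int:
    """State count for one table that observes every other agent at once."""
    return positions ** observed_agents


def modular_states(positions: int, partners: int) -> int:
    """Total state count when each module observes the prey and one partner."""
    return partners * positions ** 2


if __name__ == "__main__":
    positions = 25  # hypothetical: 5x5 window of relative positions
    for partners in range(1, 5):
        mono = monolithic_states(positions, partners + 1)  # prey + partners
        mod = modular_states(positions, partners)
        print(f"partners={partners}  monolithic={mono}  modular={mod}")
```

Under these assumptions the monolithic count is multiplied by `positions` for every added partner, while the modular total grows only linearly in the number of partners.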


Pages: 1525-1529

Page count: 5
