Time Horizon Generalization in Reinforcement Learning: Generalizing Multiple Q-Tables in Q-Learning Agents

Cited by: 1
Authors
Hatcho, Yasuyo [1]
Hattori, Kiyohiko [1]
Takadama, Keiki [1, 2]
Affiliations
[1] Univ Electrocommun, 1-5-1 Chofugaoka, Chofu, Tokyo 1828585, Japan
[2] Japan Sci & Technol Agcy JST, PRESTO, Kawaguchi, Saitama 3320012, Japan
Keywords
generalization; time horizon; sequential interaction; reinforcement learning
DOI
10.20965/jaciii.2009.p0667
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper focuses on generalization in reinforcement learning from the time horizon viewpoint, exploring a method that generalizes multiple Q-tables in the multiagent reinforcement learning domain. For this purpose, we propose time horizon generalization for reinforcement learning, which consists of (1) a Q-table selection method and (2) a Q-table merge timing method. These enable agents to (1) select which Q-tables can be generalized from among many Q-tables and (2) determine when the selected Q-tables should be generalized. Intensive simulations on the bargaining game, a sequential interaction game, have revealed the following implications: (1) both the Q-table selection and merge timing methods help replicate the results of subject experiments without ad hoc parameter setting; and (2) agents using the proposed methods achieve this replication with a smaller number of Q-tables.
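The abstract does not state the actual selection or merge timing criteria, so the following is only an illustrative sketch of what merging two Q-tables could look like. Everything in it is an assumption made for illustration, not the authors' method: the rule "two tables are mergeable when their greedy policies agree", the rule "merge only after a minimum number of updates", the averaging merge, and all names (`q_update`, `can_merge`, `ready_to_merge`, `merge`).

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_max, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update on one Q-table.

    q is a defaultdict(float) keyed by (state, action).
    """
    target = reward + gamma * next_max
    q[(state, action)] += alpha * (target - q[(state, action)])


def greedy_policy(q, states, actions):
    """Greedy action per state; ties go to the earliest action in `actions`."""
    return {s: max(actions, key=lambda a: q[(s, a)]) for s in states}


def can_merge(q_a, q_b, states, actions):
    """Hypothetical selection rule: both tables pick the same greedy actions."""
    return greedy_policy(q_a, states, actions) == greedy_policy(q_b, states, actions)


def ready_to_merge(update_counts, min_updates):
    """Hypothetical timing rule: every table has been updated often enough."""
    return all(count >= min_updates for count in update_counts)


def merge(q_a, q_b, states, actions):
    """Hypothetical merge: average the two tables entry by entry."""
    merged = defaultdict(float)
    for s in states:
        for a in actions:
            merged[(s, a)] = 0.5 * (q_a[(s, a)] + q_b[(s, a)])
    return merged
```

In a sequential game such as bargaining, one might keep one Q-table per round and apply `can_merge` to tables of adjacent rounds, which would reduce the number of tables whenever the agent behaves the same way in both rounds. Whether this matches the paper's criteria cannot be determined from the abstract alone.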
Pages: 667-674
Page count: 8