Multiple model-based reinforcement learning

Cited by: 295
Authors
Doya, K [1 ]
Samejima, K
Katagiri, K
Kawato, M
Affiliations
[1] ATR Int, Human Informat Sci Labs, Sora Ku, Kyoto 6190288, Japan
[2] Japan Sci & Technol Corp, ERATO, Kawato Dynam Brain Project, Sora Ku, Kyoto 6190288, Japan
[3] Nara Inst Sci & Technol, Nara 6300101, Japan
DOI
10.1162/089976602753712972
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a modular reinforcement learning architecture for nonlinear, nonstationary control tasks, which we call multiple model-based reinforcement learning (MMRL). The basic idea is to decompose a complex task into multiple domains in space and time based on the predictability of the environmental dynamics. The system is composed of multiple modules, each of which consists of a state prediction model and a reinforcement learning controller. The "responsibility signal," given by the softmax function of the prediction errors, is used to weight the outputs of the multiple modules, as well as to gate the learning of both the prediction models and the reinforcement learning controllers. We formulate MMRL for both the discrete-time, finite-state case and the continuous-time, continuous-state case. The performance of MMRL is demonstrated in the discrete case on a nonstationary hunting task in a grid world, and in the continuous case on a nonlinear, nonstationary control task of swinging up a pendulum with variable physical parameters.
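The responsibility signal described in the abstract — a softmax over the modules' state-prediction errors, which both weights controller outputs and gates learning — can be sketched as follows. This is a minimal illustration assuming a Gaussian error model with a shared scale parameter; the function name and the `sigma` parameter are illustrative choices, not notation from the paper.

```python
import numpy as np

def responsibility_signals(prediction_errors, sigma=1.0):
    """Softmax responsibility over modules from state-prediction errors.

    Modules whose prediction models fit the current dynamics better
    (smaller error) receive larger responsibility, which weights their
    controller outputs and gates the learning of both components.
    """
    errors = np.asarray(prediction_errors, dtype=float)
    # Log-likelihood of each module's prediction under a Gaussian error
    # model (up to an additive constant shared by all modules).
    log_lik = -errors ** 2 / (2.0 * sigma ** 2)
    # Numerically stable softmax: subtract the max before exponentiating.
    log_lik -= log_lik.max()
    lam = np.exp(log_lik)
    return lam / lam.sum()
```

The weights sum to one, so the module outputs can be combined as a convex mixture, and each module's learning rate can simply be scaled by its responsibility.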
Pages: 1347-1369
Page count: 23
Related Papers
50 records
  • [1] Multiple model-based reinforcement learning for nonlinear control
    Samejima, K
    Katagiri, K
    Doya, K
    Kawato, M
    ELECTRONICS AND COMMUNICATIONS IN JAPAN PART III-FUNDAMENTAL ELECTRONIC SCIENCE, 2006, 89 (09) : 54 - 69
  • [2] Multiple-Timescale PIA for Model-Based Reinforcement Learning
    Yamaguchi, Tomohiro
    Imatani, Eri
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2009, 13 (06) : 658 - 666
  • [3] Multiple model-based reinforcement learning explains dopamine neuronal activity
    Bertin, Mathieu
    Schweighofer, Nicolas
    Doya, Kenji
    NEURAL NETWORKS, 2007, 20 (06) : 668 - 675
  • [4] The ubiquity of model-based reinforcement learning
    Doll, Bradley B.
    Simon, Dylan A.
    Daw, Nathaniel D.
    CURRENT OPINION IN NEUROBIOLOGY, 2012, 22 (06) : 1075 - 1081
  • [5] Model-based Reinforcement Learning: A Survey
    Moerland, Thomas M.
    Broekens, Joost
    Plaat, Aske
    Jonker, Catholijn M.
    FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2023, 16 (01) : 1 - 118
  • [6] A survey on model-based reinforcement learning
    Luo, Fan-Ming
    Xu, Tian
    Lai, Hang
    Chen, Xiong-Hui
    Zhang, Weinan
    Yu, Yang
    SCIENCE CHINA-INFORMATION SCIENCES, 2024, 67 (02) : 59 - 84
  • [7] Nonparametric model-based reinforcement learning
    Atkeson, CG
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 10, 1998, 10 : 1008 - 1014
  • [8] Learning to Paint With Model-based Deep Reinforcement Learning
    Huang, Zhewei
    Heng, Wen
    Zhou, Shuchang
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 8708 - 8717
  • [9] Objective Mismatch in Model-based Reinforcement Learning
    Lambert, Nathan
    Amos, Brandon
    Yadan, Omry
    Calandra, Roberto
    LEARNING FOR DYNAMICS AND CONTROL, VOL 120, 2020, 120 : 761 - 770