Learning to Reweight Imaginary Transitions for Model-Based Reinforcement Learning

Cited by: 0
Authors
Huang, Wenzhen [1 ,2 ]
Yin, Qiyue [1 ,2 ]
Zhang, Junge [1 ,2 ]
Huang, Kaiqi [1 ,2 ,3 ]
Affiliations
[1] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Automat, CRISE, Beijing, Peoples R China
[3] CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Model-based reinforcement learning (RL) is more sample-efficient than model-free RL because it trains on imaginary trajectories generated by a learned dynamics model. When the model is inaccurate or biased, however, these imaginary trajectories can be deleterious for training the action-value and policy functions. To alleviate this problem, this paper proposes to adaptively reweight the imaginary transitions, reducing the negative effect of poorly generated trajectories. More specifically, we evaluate the effect of an imaginary transition by measuring how the loss computed on real samples changes when the transition is used to train the action-value and policy functions. Based on this evaluation criterion, we design a meta-gradient algorithm to reweight each imaginary transition. Extensive experimental results demonstrate that our method outperforms state-of-the-art model-based and model-free RL algorithms on multiple tasks. Visualization of the learned weights further validates the necessity of the reweighting scheme.
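The sketch below illustrates the general meta-gradient reweighting idea described in the abstract, assuming PyTorch: imaginary transitions get per-sample weights, a differentiable "virtual" critic update is taken on the weighted imaginary loss, the resulting critic is evaluated on real transitions, and the weights are updated by back-propagating that real-sample loss through the virtual step. All names (`q_value`, `reweight_step`, `inner_lr`, `weight_lr`), the toy linear critic, and the update details are hypothetical placeholders, not the authors' exact algorithm or architecture.

```python
# Minimal sketch of meta-gradient reweighting of imaginary transitions.
# Assumes PyTorch; the critic, batch layout, and hyperparameters are illustrative.
import torch

def q_value(params, s, a):
    # Toy linear critic Q(s, a) = [s; a] @ W + b (stand-in for a real network).
    w, b = params
    return torch.cat([s, a], dim=-1) @ w + b

def td_loss(params, batch, target_params, gamma=0.99):
    # Per-transition squared TD error; batch = (s, a, r, s_next, a_next).
    s, a, r, s_next, a_next = batch
    with torch.no_grad():
        target = r + gamma * q_value(target_params, s_next, a_next)
    return (q_value(params, s, a) - target).pow(2)

def reweight_step(params, target_params, imag_batch, real_batch,
                  inner_lr=1e-3, weight_lr=1e-2):
    # 1. One weight per imaginary transition (to be meta-learned).
    n_imag = imag_batch[0].shape[0]
    weights = torch.full((n_imag,), 1.0 / n_imag, requires_grad=True)

    # 2. Virtual, differentiable critic update on the weighted imaginary loss.
    imag_loss = (weights * td_loss(params, imag_batch, target_params)).sum()
    grads = torch.autograd.grad(imag_loss, params, create_graph=True)
    virtual_params = [p - inner_lr * g for p, g in zip(params, grads)]

    # 3. Measure how the virtual update changes the loss on real samples,
    #    and back-propagate through the virtual step to the weights.
    real_loss = td_loss(virtual_params, real_batch, target_params).mean()
    weight_grad, = torch.autograd.grad(real_loss, weights)

    # 4. Transitions that reduce the real-sample loss receive larger weights.
    new_weights = torch.clamp(weights - weight_lr * weight_grad, min=0.0)
    return (new_weights / (new_weights.sum() + 1e-8)).detach()
```

In this sketch, the returned weights would then scale the imaginary-transition losses in the actual critic (and, analogously, policy) update, so that transitions whose virtual update hurts performance on real data are down-weighted.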
Pages: 7848-7856
Page count: 9