Learning to Reweight Imaginary Transitions for Model-Based Reinforcement Learning

Cited by: 0
Authors
Huang, Wenzhen [1 ,2 ]
Yin, Qiyue [1 ,2 ]
Zhang, Junge [1 ,2 ]
Huang, Kaiqi [1 ,2 ,3 ]
Affiliations
[1] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Automat, CRISE, Beijing, Peoples R China
[3] CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Model-based reinforcement learning (RL) is more sample-efficient than model-free RL because it trains on imaginary trajectories generated by a learned dynamics model. When the model is inaccurate or biased, however, these imaginary trajectories can be deleterious for training the action-value and policy functions. To alleviate this problem, this paper proposes to adaptively reweight the imaginary transitions, so as to reduce the negative effects of poorly generated trajectories. More specifically, we evaluate the effect of an imaginary transition as the change in the loss computed on real samples when the transition is used to train the action-value and policy functions. Based on this criterion, we reweight each imaginary transition with a well-designed meta-gradient algorithm. Extensive experimental results demonstrate that our method outperforms state-of-the-art model-based and model-free RL algorithms on multiple tasks. Visualizing the learned weights further validates the necessity of the reweighting scheme.
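The abstract describes a meta-gradient loop: each imaginary transition carries a learnable weight, the critic takes a hypothetical update on the weighted imaginary TD loss, and the weights are adjusted so that the hypothetically updated critic incurs a lower loss on real transitions. The sketch below is a minimal PyTorch illustration of that idea, not the authors' released code; QNet, td_errors, reweight_step, and all hyperparameters are assumptions for exposition, and it relies on torch.func.functional_call (PyTorch >= 2.0).

```python
# Minimal sketch (NOT the authors' implementation) of meta-gradient
# reweighting of imaginary transitions, as described in the abstract.
# Assumptions: PyTorch >= 2.0; QNet, td_errors, reweight_step, and all
# hyperparameters are illustrative stand-ins.
import torch
import torch.nn as nn
from torch.func import functional_call

class QNet(nn.Module):
    """Tiny action-value network Q(s, a)."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1)).squeeze(-1)

def td_errors(q, batch, gamma=0.99):
    """One-step TD errors; `batch` = (s, a, r, s_next, a_next)."""
    s, a, r, s2, a2 = batch
    with torch.no_grad():                      # fixed bootstrap target
        target = r + gamma * q(s2, a2)
    return q(s, a) - target

def reweight_step(q, imag_batch, real_batch, inner_lr=1e-2, meta_lr=10.0):
    """Return per-transition weights for the imaginary batch that locally
    reduce the TD loss measured on the real batch (one meta-gradient step)."""
    logits = torch.zeros(imag_batch[0].shape[0], requires_grad=True)
    w = torch.softmax(logits, dim=0)           # differentiable weights

    # Inner step: hypothetical SGD update of Q on the weighted imaginary loss.
    inner_loss = (w * td_errors(q, imag_batch) ** 2).sum()
    params = dict(q.named_parameters())
    grads = torch.autograd.grad(inner_loss, list(params.values()),
                                create_graph=True)
    fast = {n: p - inner_lr * g for (n, p), g in zip(params.items(), grads)}

    # Outer (meta) loss: TD loss of the hypothetically updated critic on REAL
    # transitions; its gradient w.r.t. the weights indicates which imaginary
    # transitions helped or hurt.
    q_fast = lambda s, a: functional_call(q, fast, (s, a))
    meta_loss = (td_errors(q_fast, real_batch) ** 2).mean()
    g_logits, = torch.autograd.grad(meta_loss, logits)
    return torch.softmax(logits - meta_lr * g_logits, dim=0).detach()

if __name__ == "__main__":
    torch.manual_seed(0)
    def fake_batch(n, obs_dim=3, act_dim=1):   # stand-in transitions
        return (torch.randn(n, obs_dim), torch.randn(n, act_dim),
                torch.randn(n), torch.randn(n, obs_dim),
                torch.randn(n, act_dim))
    q = QNet(obs_dim=3, act_dim=1)
    weights = reweight_step(q, fake_batch(8), fake_batch(32))
    print(weights)  # down-weighted entries correspond to harmful imaginary data
```

In this toy form the weights are recomputed from scratch each call; the paper's algorithm additionally trains the policy with the same criterion, which this sketch omits for brevity.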
Pages: 7848-7856
Page count: 9
Related Papers
50 records in total
  • [31] Online Constrained Model-based Reinforcement Learning
    van Niekerk, Benjamin
    Damianou, Andreas
    Rosman, Benjamin
    CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI 2017), 2017
  • [32] Calibrated Model-Based Deep Reinforcement Learning
    Malik, Ali
    Kuleshov, Volodymyr
    Song, Jiaming
    Nemer, Danny
    Seymour, Harlan
    Ermon, Stefano
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019
  • [33] Learning to Attack Federated Learning: A Model-based Reinforcement Learning Attack Framework
    Li, Henger
    Sun, Xiaolin
    Zheng, Zizhan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [34] Model-Based Reinforcement Learning for Quantized Federated Learning Performance Optimization
    Yang, Nuocheng
    Wang, Sihua
    Chen, Mingzhe
    Brinton, Christopher G.
    Yin, Changchuan
    Saad, Walid
    Cui, Shuguang
    2022 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM 2022), 2022: 5063-5068
  • [35] Model-based reinforcement learning by pyramidal neurons: Robustness of the learning rule
    Eisele, M
    Sejnowski, T
    PROCEEDINGS OF THE 4TH JOINT SYMPOSIUM ON NEURAL COMPUTATION, VOL 7, 1997: 83-90
  • [36] Weighted model estimation for offline model-based reinforcement learning
    Hishinuma, Toru
    Senda, Kei
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021
  • [37] Latent Causal Dynamics Model for Model-Based Reinforcement Learning
    Hao, Zhifeng
    Zhu, Haipeng
    Chen, Wei
    Cai, Ruichu
    NEURAL INFORMATION PROCESSING, ICONIP 2023, PT II, 2024, 14448: 219-230
  • [38] Model-based reinforcement learning with model error and its application
    Tajima, Yoshiyuki
    Onisawa, Takehisa
    PROCEEDINGS OF SICE ANNUAL CONFERENCE, VOLS 1-8, 2007: 1333-1336
  • [39] Model-based reinforcement learning: a computational model and an fMRI study
    Yoshida, W
    Ishii, S
    NEUROCOMPUTING, 2005, 63: 253-269
  • [40] Reward Shaping for Model-Based Bayesian Reinforcement Learning
    Kim, Hyeoneun
    Lim, Woosang
    Lee, Kanghoon
    Noh, Yung-Kyun
    Kim, Kee-Eung
    PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015: 3548-3555