Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation

Cited by: 0
Authors
Nikishin, Evgenii [1 ]
Abachi, Romina [2 ]
Agarwal, Rishabh [1 ,3 ]
Bacon, Pierre-Luc [1 ,4 ]
Affiliations
[1] Univ Montreal, Mila, Montreal, PQ, Canada
[2] Univ Toronto, Vector Inst, Toronto, ON, Canada
[3] Google Res, Mountain View, CA USA
[4] Facebook CIFAR AI Chair, Montreal, PQ, Canada
Keywords
DOI
Not available
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The shortcomings of maximum likelihood estimation in the context of model-based reinforcement learning have been highlighted by an increasing number of papers. When the model class is misspecified or has a limited representational capacity, model parameters with high likelihood might not necessarily result in high performance of the agent on a downstream control task. To alleviate this problem, we propose an end-to-end approach for model learning which directly optimizes the expected returns using implicit differentiation. We treat a value function that satisfies the Bellman optimality operator induced by the model as an implicit function of model parameters and show how to differentiate the function. We provide theoretical and empirical evidence highlighting the benefits of our approach in the model misspecification regime compared to likelihood-based methods.
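The abstract describes treating a value function that satisfies the model-induced Bellman optimality operator as an implicit function of the model parameters. A minimal sketch of that idea on a toy two-state MDP (the parameterization, reward table, and all names here are illustrative assumptions, not taken from the paper): at the fixed point V = T_theta(V), where the greedy policy pi is locally constant, the implicit function theorem gives dV/dtheta = (I - gamma * P_pi)^{-1} * dT/dtheta.

```python
import numpy as np

GAMMA = 0.9
# Illustrative reward table r[s, a] for a 2-state, 2-action MDP
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def transitions(theta):
    """Transition tensor P[a, s, s'] parameterized by a scalar theta (toy model)."""
    p = sigmoid(theta)
    return np.array([
        [[p, 1 - p], [1 - p, p]],  # action 0
        [[1 - p, p], [p, 1 - p]],  # action 1
    ])

def value_iteration(theta, iters=500):
    """Solve the Bellman optimality fixed point V = max_a [r + gamma * P_theta V]."""
    P = transitions(theta)
    V = np.zeros(2)
    for _ in range(iters):
        Q = R + GAMMA * np.einsum('ast,t->sa', P, V)  # Q[s, a]
        V = Q.max(axis=1)
    return V, Q.argmax(axis=1)

def implicit_grad(theta, eps=1e-6):
    """dV*/dtheta via the implicit function theorem at the Bellman fixed point."""
    V, pi = value_iteration(theta)
    P = transitions(theta)
    # Jacobian of the backup w.r.t. V at the fixed point: gamma * P_pi (greedy rows)
    P_pi = np.array([P[pi[s], s] for s in range(2)])
    # Partial of the backup w.r.t. theta, with V held fixed (central differences)
    Pp, Pm = transitions(theta + eps), transitions(theta - eps)
    dT = np.array([GAMMA * (Pp[pi[s], s] - Pm[pi[s], s]) @ V / (2 * eps)
                   for s in range(2)])
    # (I - gamma * P_pi) dV/dtheta = dT/dtheta
    return np.linalg.solve(np.eye(2) - GAMMA * P_pi, dT)
```

The implicit gradient agrees with a finite-difference derivative of the fixed point itself, without differentiating through the iterations of value iteration; this is what lets a model-learning loss backpropagate expected returns into the model parameters.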
Pages: 7886-7894
Page count: 9
Related Papers
50 records in total
  • [1] Model-Based Reinforcement Learning For Robot Control
    Li, Xiang
    Shang, Weiwei
    Cong, Shuang
    [J]. 2020 5TH INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS AND MECHATRONICS (ICARM 2020), 2020, : 300 - 305
  • [2] Control Approach Combining Reinforcement Learning and Model-Based Control
    Okawa, Yoshihiro
    Sasaki, Tomotake
    Iwane, Hidenao
    [J]. 2019 12TH ASIAN CONTROL CONFERENCE (ASCC), 2019, : 1419 - 1424
  • [3] Efficient reinforcement learning: Model-based acrobot control
    Boone, G
    [J]. 1997 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION - PROCEEDINGS, VOLS 1-4, 1997, : 229 - 234
  • [4] Multiple model-based reinforcement learning for nonlinear control
    Samejima, K
    Katagiri, K
    Doya, K
    Kawato, M
    [J]. ELECTRONICS AND COMMUNICATIONS IN JAPAN PART III-FUNDAMENTAL ELECTRONIC SCIENCE, 2006, 89 (09): : 54 - 69
  • [5] Offline Model-Based Reinforcement Learning for Tokamak Control
    Char, Ian
    Abbate, Joseph
    Bardoczi, Laszlo
    Boyer, Mark D.
    Chung, Youngseog
    Conlin, Rory
    Erickson, Keith
    Mehta, Viraj
    Richner, Nathan
    Kolemen, Egemen
    Schneider, Jeff
    [J]. LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211
  • [6] Model-Based Comparative Evaluation of Building and District Control-Oriented Energy Retrofit Scenarios
    De Tommasi, Luciano
    Ridouane, Hassan
    Giannakis, Georgios
    Katsigarakis, Kyriakos
    Lilis, Georgios N.
    Rovas, Dimitrios
    [J]. BUILDINGS, 2018, 8 (07)
  • [7] Control-Oriented Learning on the Fly
    Ornik, Melkior
    Carr, Steven
    Israel, Arie
    Topcu, Ufuk
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2020, 65 (11) : 4800 - 4807
  • [8] Fault Tolerant Control combining Reinforcement Learning and Model-based Control
    Bhan, Luke
    Quinones-Grueiro, Marcos
    Biswas, Gautam
    [J]. 5TH CONFERENCE ON CONTROL AND FAULT-TOLERANT SYSTEMS (SYSTOL 2021), 2021, : 31 - 36
  • [9] Cognitive Control Predicts Use of Model-based Reinforcement Learning
    Otto, A. Ross
    Skatova, Anya
    Madlon-Kay, Seth
    Daw, Nathaniel D.
    [J]. JOURNAL OF COGNITIVE NEUROSCIENCE, 2015, 27 (02) : 319 - 333
  • [10] Model-based hierarchical reinforcement learning and human action control
    Botvinick, Matthew
    Weinstein, Ari
    [J]. PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, 2014, 369 (1655)