Adaptive Discretization for Model-Based Reinforcement Learning

Times Cited: 0
Authors
Sinclair, Sean R. [1]
Wang, Tianyu [2]
Jain, Gauri [1]
Banerjee, Siddhartha [1]
Yu, Christina Lee [1]
Affiliations
[1] Cornell Univ, Ithaca, NY 14853 USA
[2] Duke Univ, Durham, NC 27706 USA
Keywords
GO;
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
We introduce the technique of adaptive discretization to design an efficient model-based episodic reinforcement learning algorithm for large (potentially continuous) state-action spaces. Our algorithm is based on optimistic one-step value iteration extended to maintain an adaptive discretization of the space. From a theoretical perspective, we provide worst-case regret bounds for our algorithm that are competitive with state-of-the-art model-based algorithms. Moreover, our bounds are obtained via a modular proof technique that can potentially be extended to incorporate additional structure in the problem. From an implementation standpoint, our algorithm has much lower storage and computational requirements because it maintains a more efficient partition of the state and action spaces. We illustrate this via experiments on several canonical control problems, which show that our algorithm empirically performs significantly better than fixed discretization in terms of both faster convergence and lower memory usage. Interestingly, we observe empirically that while fixed-discretization model-based algorithms vastly outperform their model-free counterparts, the two achieve comparable performance under adaptive discretization.
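The abstract describes maintaining an adaptive partition of the state-action space that is refined only where the algorithm accumulates data, with optimistic value estimates guiding action selection. Below is a minimal, hypothetical Python sketch of that idea for a one-dimensional state and action space on [0, 1]; the quadrant splitting rule, the visit-count threshold, and the running-average update are illustrative assumptions and do not reproduce the paper's exact algorithm, constants, or its model-based (transition-estimating) components.

```python
# Illustrative sketch only: an adaptively refined partition of [0, 1] x [0, 1]
# with optimistic per-cell value estimates. Splitting rule and update are assumptions.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cell:
    """A rectangle [s_lo, s_hi] x [a_lo, a_hi] in the (state, action) square."""
    s_lo: float
    s_hi: float
    a_lo: float
    a_hi: float
    depth: int = 0
    visits: int = 0
    q_value: float = 1.0                       # optimistic initial estimate
    children: List["Cell"] = field(default_factory=list)

    def contains(self, s: float, a: float) -> bool:
        return self.s_lo <= s <= self.s_hi and self.a_lo <= a <= self.a_hi

    def split(self) -> None:
        """Refine into four equal quadrants that inherit the parent's estimate."""
        s_mid = 0.5 * (self.s_lo + self.s_hi)
        a_mid = 0.5 * (self.a_lo + self.a_hi)
        for lo_s, hi_s in ((self.s_lo, s_mid), (s_mid, self.s_hi)):
            for lo_a, hi_a in ((self.a_lo, a_mid), (a_mid, self.a_hi)):
                self.children.append(
                    Cell(lo_s, hi_s, lo_a, hi_a, self.depth + 1, q_value=self.q_value)
                )


class AdaptivePartition:
    """Tree-structured partition: cells are refined only where data accumulates."""

    def __init__(self) -> None:
        self.root = Cell(0.0, 1.0, 0.0, 1.0)

    def leaf(self, s: float, a: float) -> Cell:
        """Descend the tree to the leaf cell containing (s, a)."""
        node = self.root
        while node.children:
            node = next(c for c in node.children if c.contains(s, a))
        return node

    def select_action(self, s: float) -> float:
        """Play the action midpoint of the most optimistic leaf whose state slice contains s."""
        candidates = [c for c in self._leaves(self.root) if c.s_lo <= s <= c.s_hi]
        best = max(candidates, key=lambda c: c.q_value)
        return 0.5 * (best.a_lo + best.a_hi)

    def update(self, s: float, a: float, target: float) -> None:
        """Running-average update of the visited leaf, then the (assumed) refinement rule."""
        cell = self.leaf(s, a)
        cell.visits += 1
        cell.q_value += (target - cell.q_value) / cell.visits
        if cell.visits >= 4 ** cell.depth and not cell.children:  # threshold grows with depth
            cell.split()

    def _leaves(self, node: Cell):
        if not node.children:
            yield node
        else:
            for child in node.children:
                yield from self._leaves(child)


if __name__ == "__main__":
    import random
    part = AdaptivePartition()
    for _ in range(200):
        s = random.random()
        a = part.select_action(s)
        reward = 1.0 - abs(a - 0.7)            # toy reward peaked at a = 0.7 (assumed)
        part.update(s, a, reward)
    print(sum(1 for _ in part._leaves(part.root)), "leaf cells after 200 steps")
```

The tree structure is what drives the storage savings highlighted in the abstract: cells far from the region the policy actually visits are never subdivided, so the number of stored estimates grows with the observed data rather than with the resolution of a uniform grid.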
Pages: 14
Related Papers (50 total)
[1] Ma, Yecheng Jason; Shen, Andrew; Bastani, Osbert; Jayaraman, Dinesh. Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning. Thirty-Sixth AAAI Conference on Artificial Intelligence / Thirty-Fourth Conference on Innovative Applications of Artificial Intelligence / Twelfth Symposium on Educational Advances in Artificial Intelligence, 2022: 5404-5412.
[2] Nousiainen, Jalo; Rajani, Chang; Kasper, Markus; Helin, Tapio. Adaptive optics control using model-based reinforcement learning. Optics Express, 2021, 29(10): 15327-15344.
[3] Nousiainen, Jalo; Engler, Byron; Kasper, Markus; Helin, Tapio; Heritier, Cedric T.; Rajani, Chang. Advances in model-based reinforcement learning for Adaptive Optics control. Adaptive Optics Systems VIII, 2022, 12185.
[4] Sinclair, Sean R.; Banerjee, Siddhartha; Yu, Christina Lee. Adaptive Discretization in Online Reinforcement Learning. Operations Research, 2023, 71(5): 1636-1652.
[5] Yang, Yijun; Jiang, Jing; Wang, Zhuowei; Duan, Qiqi; Shi, Yuhui. BiES: Adaptive Policy Optimization for Model-Based Offline Reinforcement Learning. AI 2021: Advances in Artificial Intelligence, 2022, 13151: 570-581.
[6] Nousiainen, Jalo; Engler, Byron; Kasper, Markus; Rajani, Chang; Helin, Tapio; Heritier, Cédric T.; Quanz, Sascha P.; Glauser, Adrian M. Laboratory experiments of model-based reinforcement learning for adaptive optics control. Journal of Astronomical Telescopes, Instruments, and Systems, 2024, 10(1).
[7] Luo, Fan-Ming; Xu, Tian; Lai, Hang; Chen, Xiong-Hui; Zhang, Weinan; Yu, Yang. A survey on model-based reinforcement learning. Science China Information Sciences, 2024, 67.
[8] Doll, Bradley B.; Simon, Dylan A.; Daw, Nathaniel D. The ubiquity of model-based reinforcement learning. Current Opinion in Neurobiology, 2012, 22(6): 1075-1081.
[9] Moerland, Thomas M.; Broekens, Joost; Plaat, Aske; Jonker, Catholijn M. Model-based Reinforcement Learning: A Survey. Foundations and Trends in Machine Learning, 2023, 16(1): 1-118.
[10] Atkeson, C. G. Nonparametric model-based reinforcement learning. Advances in Neural Information Processing Systems 10, 1998, 10: 1008-1014.