On the Convergence of Reinforcement Learning in Nonlinear Continuous State Space Problems

Cited by: 0
Authors
Goyal, Raman [1]
Chakravorty, Suman [1]
Wang, Ran [1]
Mohamed, Mohamed Naveed Gul [1]
Affiliations
[1] Texas A&M Univ, Dept Aerosp Engn, College Stn, TX 77843 USA
Keywords
RL; Optimal control; Nonlinear systems
DOI
10.1109/CDC45484.2021.9682829
Chinese Library Classification (CLC)
TP [Automation technology; Computer technology]
Discipline Classification Code
0812
Abstract
We consider the problem of Reinforcement Learning for nonlinear stochastic dynamical systems. We show that, in the RL setting, there is an inherent "Curse of Variance" in addition to Bellman's infamous "Curse of Dimensionality": in particular, the variance in the solution grows factorial-exponentially in the order of the approximation. A fundamental consequence is that, in order to control this explosive variance growth and thus ensure accuracy, the search in RL must be restricted to "local" feedback solutions. We further show that the deterministic optimal control has a perturbation structure, in that the higher-order terms do not affect the calculation of the lower-order terms, which can be exploited in RL to obtain accurate local solutions.
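A rough, illustrative sketch of the perturbation structure described in the abstract (not taken from the paper itself; the discrete-time setting, the nominal state-control trajectory, the state deviation, and the cost-expansion coefficients and feedback gain below are notational assumptions) is an expansion of the optimal cost-to-go and feedback about a nominal trajectory:

\[
J_t(\bar{x}_t + \delta x_t) = J_t^{(0)} + G_t\,\delta x_t + \tfrac{1}{2}\,\delta x_t^{\top} P_t\,\delta x_t + O\!\left(\lVert \delta x_t \rVert^{3}\right),
\qquad
u_t = \bar{u}_t + K_t\,\delta x_t + O\!\left(\lVert \delta x_t \rVert^{2}\right).
\]

In such an expansion, the nominal pair $(\bar{x}_t, \bar{u}_t)$ is determined by the deterministic problem alone, and the gain $K_t$ is fixed by the expansion terms up to second order together with the dynamics linearized about the nominal, independently of the third- and higher-order coefficients. This is one way to read the claim that higher-order terms do not affect the calculation of the lower-order terms, so that accurate "local" feedback solutions can be computed without incurring the variance of higher-order approximations.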
Pages: 2969-2975
Number of pages: 7
Related Papers
50 records in total
  • [1] Swarm Reinforcement Learning Methods for Problems with Continuous State-Action Space
    Iima, Hitoshi
    Kuroe, Yasuaki
    Emoto, Kazuo
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2011, : 2173 - 2180
  • [2] Budgeted Reinforcement Learning in Continuous State Space
    Carrara, Nicolas
    Leurent, Edouard
    Laroche, Romain
    Urvoy, Tanguy
    Maillard, Odalric-Ambrym
    Pietquin, Olivier
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [3] A state space filter for reinforcement learning in POMDPs - Application to a continuous state space -
    Nagayoshi, Masato
    Murao, Hajime
    Tamaki, Hisashi
    [J]. 2006 SICE-ICASE INTERNATIONAL JOINT CONFERENCE, VOLS 1-13, 2006, : 3098 - +
  • [4] Tree based discretization for continuous state space reinforcement learning
    Uther, WTB
    Veloso, MM
    [J]. FIFTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-98) AND TENTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE (IAAI-98) - PROCEEDINGS, 1998, : 769 - 774
  • [5] Inverse Reinforcement Learning in a Continuous State Space with Formal Guarantees
    Dexter, Gregory
    Bello, Kevin
    Honorio, Jean
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [6] Experiments with reinforcement learning in problems with continuous state and action spaces
    Santamaria, JC
    Sutton, RS
    Ram, A
    [J]. ADAPTIVE BEHAVIOR, 1997, 6 (02) : 163 - 217
  • [7] Reinforcement learning in continuous time and space
    Doya, K
    [J]. NEURAL COMPUTATION, 2000, 12 (01) : 219 - 245
  • [8] BEHAVIOR ACQUISITION ON A MOBILE ROBOT USING REINFORCEMENT LEARNING WITH CONTINUOUS STATE SPACE
    Arai, Tomoyuki
    Toda, Yuichiro
    Kubota, Naoyuki
    [J]. PROCEEDINGS OF 2019 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), 2019, : 458 - 461
  • [9] Reinforcement Learning Method for Continuous State Space Based on Dynamic Neural Network
    Sun, Wei
    Wang, Xuesong
    Cheng, Yuhu
    [J]. 2008 7TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-23, 2008, : 750 - 754
  • [10] CONTINUOUS ACTION REINFORCEMENT LEARNING AUTOMATA Performance and Convergence
    Rodriguez, Abdel
    Grau, Ricardo
    Nowe, Ann
    [J]. ICAART 2011: PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 2, 2011, : 473 - 478