Budgeted Reinforcement Learning in Continuous State Space

Cited by: 0
Authors
Carrara, Nicolas [1 ,6 ]
Leurent, Edouard [1 ,2 ,6 ]
Laroche, Romain [3 ]
Urvoy, Tanguy [4 ]
Maillard, Odalric-Ambrym [1 ]
Pietquin, Olivier [1 ,5 ,6 ]
Affiliations
[1] INRIA Lille - Nord Europe, SequeL Team, Lille, France
[2] Renault Group, Boulogne, France
[3] Microsoft Research, Montreal, QC, Canada
[4] Orange Labs, Lannion, France
[5] Google Research, Brain Team, Mountain View, CA, USA
[6] Univ. Lille, CNRS, Centrale Lille, Inria, UMR 9189 - CRIStAL, Lille, France
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
A Budgeted Markov Decision Process (BMDP) is an extension of a Markov Decision Process to critical applications requiring safety constraints. It relies on a notion of risk implemented in the form of a cost signal constrained to lie below an adjustable threshold. So far, BMDPs could only be solved in the case of finite state spaces with known dynamics. This work extends the state of the art to environments with continuous state spaces and unknown dynamics. We show that the solution to a BMDP is a fixed point of a novel Budgeted Bellman Optimality operator. This observation allows us to introduce natural extensions of Deep Reinforcement Learning algorithms to address large-scale BMDPs. We validate our approach on two simulated applications: spoken dialogue and autonomous driving.
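Note on the formalism (a minimal sketch; the budget notation $\beta$, the reward/cost value pair $Q = (Q_r, Q_c)$, and the budget-augmented state $\overline{s} = (s, \beta)$ are assumed here from the standard BMDP formulation, not quoted from this record): a BMDP asks for a policy that maximizes expected return while keeping the expected discounted cost below the adjustable threshold,

\[ \max_{\pi} \; \mathbb{E}_{\pi}\Big[\sum_{t \geq 0} \gamma^{t} r_t\Big] \quad \text{s.t.} \quad \mathbb{E}_{\pi}\Big[\sum_{t \geq 0} \gamma^{t} c_t\Big] \leq \beta. \]

On the budget-augmented space, the Budgeted Bellman Optimality operator mentioned in the abstract can then be sketched as

\[ (\mathcal{T}Q)(\overline{s}, \overline{a}) = \overline{r}(\overline{s}, \overline{a}) + \gamma \, \mathbb{E}_{\overline{s}' \sim \overline{P}(\cdot \mid \overline{s}, \overline{a})} \; \mathbb{E}_{\overline{a}' \sim \pi_{\mathrm{greedy}}(\cdot \mid \overline{s}'; Q)} \, Q(\overline{s}', \overline{a}'), \]

where $\pi_{\mathrm{greedy}}$ maximizes the reward component $Q_r$ subject to the cost component $Q_c$ staying within the budget carried in $\overline{s}'$; the abstract's claim is that the solution of the BMDP is a fixed point of this operator.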
Pages: 11
Related Papers (showing 10 of 50)
  • [1] A state space filter for reinforcement learning in POMDPs - Application to a continuous state space
    Nagayoshi, Masato; Murao, Hajime; Tamaki, Hisashi
    2006 SICE-ICASE International Joint Conference, Vols 1-13, 2006, pp. 3098+.
  • [2] Budgeted Hierarchical Reinforcement Learning
    Leon, Aurelia; Denoyer, Ludovic
    2018 International Joint Conference on Neural Networks (IJCNN), 2018.
  • [3] On the Convergence of Reinforcement Learning in Nonlinear Continuous State Space Problems
    Goyal, Raman; Chakravorty, Suman; Wang, Ran; Mohamed, Mohamed Naveed Gul
    2021 60th IEEE Conference on Decision and Control (CDC), 2021, pp. 2969-2975.
  • [4] Tree based discretization for continuous state space reinforcement learning
    Uther, WTB; Veloso, MM
    Fifteenth National Conference on Artificial Intelligence (AAAI-98) and Tenth Conference on Innovative Applications of Artificial Intelligence (IAAI-98) - Proceedings, 1998, pp. 769-774.
  • [5] Inverse Reinforcement Learning in a Continuous State Space with Formal Guarantees
    Dexter, Gregory; Bello, Kevin; Honorio, Jean
    Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021, vol. 34.
  • [6] Reinforcement learning in continuous time and space
    Doya, K
    Neural Computation, 2000, 12(1): 219-245.
  • [7] Behavior Acquisition on a Mobile Robot Using Reinforcement Learning with Continuous State Space
    Arai, Tomoyuki; Toda, Yuichiro; Kubota, Naoyuki
    Proceedings of 2019 International Conference on Machine Learning and Cybernetics (ICMLC), 2019, pp. 458-461.
  • [8] Swarm Reinforcement Learning Methods for Problems with Continuous State-Action Space
    Iima, Hitoshi; Kuroe, Yasuaki; Emoto, Kazuo
    2011 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2011, pp. 2173-2180.
  • [9] Reinforcement Learning Method for Continuous State Space Based on Dynamic Neural Network
    Sun, Wei; Wang, Xuesong; Cheng, Yuhu
    2008 7th World Congress on Intelligent Control and Automation, Vols 1-23, 2008, pp. 750-754.
  • [10] Switching reinforcement learning for continuous action space
    Nagayoshi, Masato; Murao, Hajime; Tamaki, Hisashi
    Electronics and Communications in Japan, 2012, 95(3): 37-44.