Safe Reinforcement Learning Using Wasserstein Distributionally Robust MPC and Chance Constraint

Cited by: 5
Authors
Kordabad, Arash Bahari [1]
Wisniewski, Rafael [2]
Gros, Sebastien [1]
Affiliations
[1] Norwegian Univ Sci & Technol NTNU, Dept Engn Cybernet, N-7034 Trondheim, Norway
[2] Aalborg Univ, Dept Elect Syst, DK-9220 Aalborg, Denmark
Keywords
Safe reinforcement learning; model predictive control; distributionally robust optimization; chance constraint; conditional value at risk; Q-learning; optimization
DOI
10.1109/ACCESS.2022.3228922
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this paper, we address the chance-constrained safe Reinforcement Learning (RL) problem using function approximators based on Stochastic Model Predictive Control (SMPC) and Distributionally Robust Model Predictive Control (DRMPC). We use Conditional Value at Risk (CVaR) to measure the probability of constraint violation and safety. To provide a policy that is safe by construction, we first propose using a parameterized nonlinear DRMPC at each time step. DRMPC optimizes a finite-horizon cost function subject to the worst-case constraint violation over an ambiguity set. As the ambiguity set, we use a statistical ball around the empirical distribution whose radius is measured by the Wasserstein metric. Unlike SMPC based on the sample average approximation, DRMPC provides a probabilistic guarantee on the out-of-sample risk and requires fewer disturbance samples. Q-learning is then used to optimize the DRMPC parameters to achieve the best closed-loop performance. Wheeled Mobile Robot (WMR) path planning with obstacle avoidance is considered to illustrate the efficiency of the proposed method.
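The abstract's use of CVaR as a safety measure can be illustrated with a minimal sketch. It is not the paper's DRMPC formulation: it only computes the empirical CVaR of sampled constraint values with the standard Rockafellar-Uryasev formula and shows that a non-positive CVaR at tail probability eps keeps the empirical violation rate at or below eps. The function names `empirical_cvar` and `violation_rate`, the sample values, and the convention that a constraint value above zero counts as a violation are assumptions made for illustration. The Wasserstein ambiguity set is not modeled here.

```python
import numpy as np


def empirical_cvar(samples, eps):
    """Empirical CVaR at tail probability eps (hypothetical helper).

    Uses the Rockafellar-Uryasev formula
        CVaR = min_t  t + E[max(z - t, 0)] / eps.
    The objective is convex and piecewise linear in t, so for an
    empirical distribution the minimum is attained at a sample value.
    """
    z = np.asarray(samples, dtype=float)
    # Row i holds max(z_j - t_i, 0) for the candidate t_i = z_i.
    excess = np.maximum(z[None, :] - z[:, None], 0.0)
    objective = z + excess.mean(axis=1) / eps
    return float(objective.min())


def violation_rate(samples):
    """Fraction of samples with a constraint value above zero (assumed convention)."""
    z = np.asarray(samples, dtype=float)
    return float(np.mean(z > 0.0))


# Illustrative constraint values h(x, w_i): one of ten samples violates h <= 0.
h_samples = [-3.0, -2.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, 0.5]
eps = 0.2

cvar = empirical_cvar(h_samples, eps)
rate = violation_rate(h_samples)
print(f"CVaR at eps={eps}: {cvar:.3f}, empirical violation rate: {rate:.2f}")
```

Because CVaR upper-bounds the corresponding quantile, requiring the CVaR to be non-positive is a conservative surrogate for the chance constraint, which is why it is a common way to make such constraints tractable.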
Pages: 130058 - 130067
Number of pages: 10
Related papers
50 in total
  • [21] Data-driven distributionally robust chance-constrained optimization with Wasserstein metric
    Ji, Ran
    Lejeune, Miguel A.
    JOURNAL OF GLOBAL OPTIMIZATION, 2021, 79 (04) : 779 - 811
  • [22] DISTRIBUTIONALLY ROBUST CHANCE CONSTRAINED SVM MODEL WITH l2-WASSERSTEIN DISTANCE
    Ma, Qing
    Wang, Yanjun
    JOURNAL OF INDUSTRIAL AND MANAGEMENT OPTIMIZATION, 2023, 19 (02) : 916 - 931
  • [23] Data-driven distributionally robust chance-constrained optimization with Wasserstein metric
    Ran Ji
    Miguel A. Lejeune
    Journal of Global Optimization, 2021, 79 : 779 - 811
  • [24] Automatic Exploration Process Adjustment for Safe Reinforcement Learning with Joint Chance Constraint Satisfaction
    Okawa, Yoshihiro
    Sasaki, Tomotake
    Iwane, Hidenao
    IFAC PAPERSONLINE, 2020, 53 (02) : 1588 - 1595
  • [25] Principled Learning Method for Wasserstein Distributionally Robust Optimization with Local Perturbations
    Kwon, Yongchan
    Kim, Wonyoung
    Won, Joong-Ho
    Paik, Myunghee Cho
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [26] Distributionally robust chance constraint with unimodality-skewness information and conic reformulation
    Shang, Chao
    Wang, Chao
    You, Keyou
    Huang, Dexian
    OPERATIONS RESEARCH LETTERS, 2022, 50 (02) : 176 - 183
  • [27] A Wasserstein distributionally robust chance constrained programming approach for emergency medical system planning problem
    Yuan, Yuefei
    Song, Qiankun
    Zhou, Bo
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2022, 53 (10) : 2136 - 2148
  • [28] CVaR-Based Approximations of Wasserstein Distributionally Robust Chance Constraints with Application to Process Scheduling
    Liu, Botong
    Zhang, Qi
    Ge, Xiaolong
    Yuan, Zhihong
    INDUSTRIAL & ENGINEERING CHEMISTRY RESEARCH, 2020, 59 (20) : 9562 - 9574
  • [29] Wasserstein distributionally robust risk-constrained iterative MPC for motion planning: computationally efficient approximations
    Zolanvari, Alireza
    Cherukuri, Ashish
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023 : 2022 - 2029
  • [30] Distributionally Robust Quickest Change Detection using Wasserstein Uncertainty Sets
    Xie, Liyan
    Liang, Yuchen
    Veeravalli, Venugopal V.
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238