Robust Reinforcement Learning via Progressive Task Sequence

Authors
Li, Yike [1]
Tian, Yunzhe [1]
Tong, Endong [1]
Niu, Wenjia [1]
Liu, Jiqiang [1]
Affiliations
[1] Beijing Jiaotong Univ, Beijing Key Lab Secur & Privacy Intelligent Trans, Beijing, Peoples R China
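Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405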
Abstract
Robust reinforcement learning (RL) remains a challenging problem because of the gap between simulation and the real world. Existing work typically casts robust RL as a max-min problem, in which the cumulative reward is maximized under the worst possible perturbations. However, worst-case optimization leads either to overly conservative solutions or to an unstable training process, which in turn degrades policy robustness and generalization performance. In this paper, we tackle this problem from both the problem formulation and the algorithm design. First, we formulate robust RL as a max-expectation optimization problem, where the goal is to find an optimal policy under both the worst cases and the non-worst cases. Then, we propose a novel framework, DRRL, to solve the max-expectation optimization. Given our definition of feasible tasks, a task generation and sequencing mechanism is introduced to dynamically output tasks at a difficulty level appropriate for the current policy. With these progressive tasks, DRRL realizes dynamic multi-task learning to improve policy robustness and training stability. Finally, extensive experiments demonstrate that the proposed method achieves strong performance on the unmanned CarRacing game and multiple high-dimensional MuJoCo environments.
Pages: 455-463
Number of pages: 9