Robust Reinforcement Learning via Progressive Task Sequence

Cited: 0
Authors
Li, Yike [1]
Tian, Yunzhe [1]
Tong, Endong [1]
Niu, Wenjia [1]
Liu, Jiqiang [1]
Affiliations
[1] Beijing Jiaotong Univ, Beijing Key Lab Secur & Privacy Intelligent Trans, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
None
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Robust reinforcement learning (RL) remains challenging due to the gap between simulation and the real world. Existing efforts typically address robust RL by solving a max-min problem, maximizing the cumulative reward under the worst possible perturbations. However, worst-case optimization leads either to overly conservative solutions or to an unstable training process, both of which degrade policy robustness and generalization. In this paper, we tackle this problem through both the problem formulation and the algorithm design. First, we formulate robust RL as a max-expectation optimization problem, whose goal is to find an optimal policy under both worst-case and non-worst-case conditions. Then, we propose a novel framework, DRRL, to solve this max-expectation optimization. Based on our definition of feasible tasks, a task generation and sequencing mechanism dynamically produces tasks at a difficulty level appropriate for the current policy. With these progressive tasks, DRRL realizes dynamic multi-task learning that improves both policy robustness and training stability. Finally, extensive experiments demonstrate that the proposed method achieves strong performance on the unmanned CarRacing game and on multiple high-dimensional MuJoCo environments.
Pages: 455-463
Page count: 9
Related Papers
50 records in total
  • [31] Learning Multi-Task Transferable Rewards via Variational Inverse Reinforcement Learning
    Yoo, Se-Wook
    Seo, Seung-Woo
    2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2022), 2022
  • [32] Learning task-relevant representations via rewards and real actions for reinforcement learning
    Yuan, Linghui
    Lu, Xiaowei
    Liu, Yunlong
    KNOWLEDGE-BASED SYSTEMS, 2024, 294
  • [33] Robust Adversarial Reinforcement Learning for Optimal Assembly Sequence Definition in a Cobot Workcell
    Alessio, Alessandro
    Aliev, Khurshid
    Antonelli, Dario
    ADVANCES IN MANUFACTURING III, VOL 2: PRODUCTION ENGINEERING: RESEARCH AND TECHNOLOGY INNOVATIONS, INDUSTRY 4.0, 2022, : 25 - 34
  • [34] Model-free robust reinforcement learning via Polynomial Chaos
    Liu, Jianxiang
    Wu, Faguo
    Zhang, Xiao
    KNOWLEDGE-BASED SYSTEMS, 2025, 309
  • [35] Robust Formation Control for Cooperative Underactuated Quadrotors via Reinforcement Learning
    Zhao, Wanbing
    Liu, Hao
    Lewis, Frank L.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (10) : 4577 - 4587
  • [36] Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning
    Qin, Pengda
    Xu, Weiran
    Wang, William Yang
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1, 2018, : 2137 - 2147
  • [37] Resilient Dynamic Channel Access via Robust Deep Reinforcement Learning
    Wang, Feng
    Zhong, Chen
    Gursoy, M. Cenk
    Velipasalar, Senem
    IEEE ACCESS, 2021, 9 : 163188 - 163203
  • [38] Robust Energy Management Policies for Solar Microgrids via Reinforcement Learning
    Jones, Gerald
    Li, Xueping
    Sun, Yulin
    ENERGIES, 2024, 17 (12)
  • [39] Robust Contact-Rich Task Learning With Reinforcement Learning and Curriculum-Based Domain Randomization
    Aflakian, Ali
    Hathaway, Jamie
    Stolkin, Rustam
    Rastegarpanah, Alireza
    IEEE ACCESS, 2024, 12 : 103461 - 103472
  • [40] Sequence generation for multi-task scheduling in cloud manufacturing with deep reinforcement learning
    Ping, Yaoyao
    Liu, Yongkui
    Zhang, Lin
    Wang, Lihui
    Xu, Xun
    JOURNAL OF MANUFACTURING SYSTEMS, 2023, 67 : 315 - 337