Boundary Data Augmentation for Offline Reinforcement Learning

Cited: 0
Authors
SHEN Jiahao [1 ,2 ]
JIANG Ke [1 ,2 ]
TAN Xiaoyang [1 ,2 ]
Affiliations
[1] College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics
[2] MIIT Key Laboratory of Pattern Analysis and Machine Intelligence
Funding
National Key Research and Development Program of China; National Science Foundation (USA);
Keywords
DOI
Not available
CLC Number
TP181 [Automated Reasoning, Machine Learning]
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Offline reinforcement learning (ORL) aims to learn a rational agent purely from previously collected behavior data, without any online interaction. One of the major challenges in ORL is distribution shift, i.e., the mismatch between the knowledge of the learned policy and the reality of the underlying environment. Recent works usually handle this in an overly pessimistic manner, avoiding out-of-distribution (OOD) queries as much as possible, which can hurt the robustness of the agent at unseen states. In this paper, we propose a simple but effective method to address this issue. The key idea is to enhance the robustness of the new policy learned offline by weakening its confidence in highly uncertain regions. We propose to find those regions by simulating them with a modified Generative Adversarial Net (GAN), such that the generated data follow the same distribution as the old experience but are difficult for the behavior policy (or some other reference policy) to handle. We then use this information to regularize the ORL algorithm, penalizing overconfident behavior in these regions. Extensive experiments on several publicly available offline RL benchmarks demonstrate the feasibility and effectiveness of the proposed method.
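
Since the paper itself is not reproduced on this record, the following is only a minimal PyTorch sketch of the mechanism the abstract describes: a GAN generator is kept on the offline data distribution by a discriminator while being pushed toward states that a reference value network scores poorly (the hard "boundary" regions), and the resulting samples feed a regularizer that pushes down the learned Q-values there. The network sizes, the hardness term, the boundary_penalty helper, and all weights are hypothetical illustrations, not the authors' exact formulation.

# Minimal sketch, assuming a PyTorch implementation; all names and weights below
# are illustrative, not the paper's exact design.
import torch
import torch.nn as nn

state_dim, noise_dim, hidden = 17, 8, 256

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim))

generator = mlp(noise_dim, state_dim)       # G: noise -> synthetic "boundary" state
discriminator = mlp(state_dim, 1)           # D: keeps G close to the offline data distribution
reference_value = mlp(state_dim, 1)         # stand-in value net of the behavior/reference policy
bce = nn.BCEWithLogitsLoss()

g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def gan_step(real_states, hardness_weight=0.1):
    """One modified-GAN update: generated states should (i) look like the batch
    (ordinary GAN losses) and (ii) be hard for the reference policy (low value)."""
    batch = real_states.size(0)
    fake_states = generator(torch.randn(batch, noise_dim))

    # Discriminator: real states -> 1, generated states -> 0.
    d_loss = bce(discriminator(real_states), torch.ones(batch, 1)) + \
             bce(discriminator(fake_states.detach()), torch.zeros(batch, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator: fool D while *minimizing* the reference value, so samples land
    # in on-distribution but highly uncertain, hard-to-handle regions.
    g_loss = bce(discriminator(fake_states), torch.ones(batch, 1)) + \
             hardness_weight * reference_value(fake_states).mean()
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return fake_states.detach()

def boundary_penalty(q_network, policy, boundary_states, weight=1.0):
    """Hypothetical regularizer added to an offline RL loss: push down the learned
    Q-values on the generated boundary states to discourage overconfidence there."""
    actions = policy(boundary_states)
    return weight * q_network(torch.cat([boundary_states, actions], dim=-1)).mean()

In use, gan_step would run alongside the offline RL updates on each minibatch of dataset states, and boundary_penalty would be added to the critic loss of whichever base ORL algorithm is being regularized.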
Pages: 29 - 36
Page count: 8