Optimistic Sampling Strategy for Data-Efficient Reinforcement Learning

Cited by: 1
Authors
Zhao, Dongfang [1]
Liu, Jiafeng [1]
Wu, Rui [1]
Cheng, Dansong [1]
Tang, Xianglong [1]
Affiliations
[1] Harbin Inst Technol, Harbin, Heilongjiang, Peoples R China
Source
IEEE ACCESS | 2019 / Vol. 7
Funding
National Science Foundation (United States);
Keywords
Reinforcement learning; information entropy; optimistic sampling; data efficiency;
DOI
10.1109/ACCESS.2019.2913001
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The large number of interactions required with the environment is one of the most important problems in reinforcement learning (RL). To deal with this problem, several data-efficient RL algorithms have been proposed and successfully applied in practice. Unlike previous research, which focuses on the optimal policy evaluation and policy improvement stages, we actively select informative samples by leveraging an entropy-based optimal sampling strategy that takes the initial sample set into consideration. During the initial sampling process, information entropy is used to describe the potential samples, and the agent selects the most informative samples using an optimization method. As a result, the initial sample set is more informative than one obtained with a random or fixed strategy, so a more accurate initial dynamic model and policy can be learned. The proposed optimal sampling method thus guides the agent to search in a more informative region. Experimental results on standard benchmark problems involving a pendulum, a cart pole, and a cart double pendulum show that our optimal sampling strategy performs better in terms of data efficiency.
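The abstract does not give the entropy criterion or the optimization method, so the following is only an illustrative sketch of entropy-based sample selection, not the authors' algorithm. It assumes a Gaussian-process model with an RBF kernel and picks candidates greedily by the differential entropy of the model's predictive distribution. The names `rbf_kernel`, `gaussian_entropy`, and `select_informative`, and the parameters `noise` and `length_scale`, are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


def gaussian_entropy(variance):
    """Differential entropy of a 1-D Gaussian with the given variance."""
    return 0.5 * np.log(2.0 * np.pi * np.e * variance)


def select_informative(candidates, n_select, noise=1e-3, length_scale=1.0):
    """Greedily pick the candidates with the highest predictive entropy.

    Assumption (not from the paper): the model is a zero-mean GP with an
    RBF kernel, so the entropy depends only on the posterior variance.
    """
    n = len(candidates)
    available = np.ones(n, dtype=bool)
    chosen = []
    for _ in range(n_select):
        post_var = np.ones(n)  # prior variance k(x, x) = 1 for the RBF kernel
        if chosen:
            x_sel = candidates[chosen]
            k_ss = rbf_kernel(x_sel, x_sel, length_scale)
            k_ss = k_ss + noise * np.eye(len(chosen))
            k_cs = rbf_kernel(candidates, x_sel, length_scale)
            solved = np.linalg.solve(k_ss, k_cs.T)
            # Posterior variance: k(x, x) - k_cs K^{-1} k_cs^T (diagonal only).
            post_var = post_var - np.einsum("ij,ji->i", k_cs, solved)
        post_var = np.maximum(post_var, 1e-12)
        entropy = gaussian_entropy(post_var + noise)
        # Points already chosen must not be picked again.
        entropy = np.where(available, entropy, -np.inf)
        best = int(np.argmax(entropy))
        chosen.append(best)
        available[best] = False
    return chosen


if __name__ == "__main__":
    # Three identical candidates at 0 and one distant candidate at 5.
    points = np.array([[0.0], [0.0], [0.0], [5.0]])
    print(select_informative(points, 2))
```

In this sketch the second pick skips the duplicates of the first point, because observing one of them has already reduced the predictive entropy there. A random or fixed initial sampling strategy has no such mechanism.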
Pages: 55763 - 55769
Page count: 7
Related papers
50 entries in total
  • [1] Data-Efficient Hierarchical Reinforcement Learning
    Nachum, Ofir
    Gu, Shixiang
    Lee, Honglak
    Levine, Sergey
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [2] Robust On-Policy Sampling for Data-Efficient Policy Evaluation in Reinforcement Learning
    Zhong, Rujie
    Zhang, Duohan
    Schafer, Lukas
    Albrecht, Stefano V.
    Hanna, Josiah P.
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [3] Data-Efficient Reinforcement Learning for Malaria Control
    Zou, Lixin
    [J]. PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021 : 507 - 513
  • [4] Pretraining Representations for Data-Efficient Reinforcement Learning
    Schwarzer, Max
    Rajkumar, Nitarshan
    Noukhovitch, Michael
    Anand, Ankesh
    Charlin, Laurent
    Hjelm, Devon
    Bachman, Philip
    Courville, Aaron
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [5] EqR: Equivariant Representations for Data-Efficient Reinforcement Learning
    Mondal, Arnab Kumar
    Jain, Vineet
    Siddiqi, Kaleem
    Ravanbakhsh, Siamak
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [6] Data-Efficient Reinforcement Learning for Variable Impedance Control
    Anand, Akhil S.
    Kaushik, Rituraj
    Gravdahl, Jan Tommy
    Abu-Dakka, Fares J.
    [J]. IEEE ACCESS, 2024, 12 : 15631 - 15641
  • [7] BarlowRL: Barlow Twins for Data-Efficient Reinforcement Learning
    Cagatan, Omer Veysel
    Akgun, Baris
    [J]. ASIAN CONFERENCE ON MACHINE LEARNING, VOL 222, 2023, 222
  • [8] Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data
    Nie, Allen
    Flet-Berliac, Yannis
    Jordan, Deon R.
    Steenbergen, William
    Brunskill, Emma
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [9] Data-Efficient Offline Reinforcement Learning with Approximate Symmetries
    Angelotti, Giorgio
    Drougard, Nicolas
    Chanel, Caroline P. C.
    [J]. AGENTS AND ARTIFICIAL INTELLIGENCE, ICAART 2023, 2024, 14546 : 164 - 186
  • [10] Concurrent Credit Assignment for Data-efficient Reinforcement Learning
    Dauce, Emmanuel
    [J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022