Optimistic Sampling Strategy for Data-Efficient Reinforcement Learning

Cited by: 1
Authors
Zhao, Dongfang [1]
Liu, Jiafeng [1]
Wu, Rui [1]
Cheng, Dansong [1]
Tang, Xianglong [1]
Affiliations
[1] Harbin Institute of Technology, Harbin, Heilongjiang, People's Republic of China
Funding
National Science Foundation (United States)
Keywords
Reinforcement learning; information entropy; optimistic sampling; data efficiency
DOI
10.1109/ACCESS.2019.2913001
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The large number of interactions with the environment that is required is one of the most important problems in reinforcement learning (RL). To address this problem, several data-efficient RL algorithms have been proposed and successfully applied in practice. Unlike previous research, which focuses on the optimal policy evaluation and policy improvement stages, we actively select informative samples by leveraging an entropy-based optimal sampling strategy that takes the initial sample set into consideration. During the initial sampling process, information entropy is used to describe the potential samples, and the agent selects the most informative ones using an optimization method. As a result, the initial sample set is more informative than one obtained with a random or fixed strategy, so a more accurate initial dynamics model and policy can be learned. The proposed optimal sampling method thereby guides the agent to search in a more informative region. Experimental results on standard benchmark problems (pendulum, cart pole, and cart double pendulum) show that our optimal sampling strategy achieves better data efficiency.
Pages: 55763-55769
Number of pages: 7