Optimistic Sampling Strategy for Data-Efficient Reinforcement Learning

Cited by: 1

Authors:

Zhao, Dongfang [1]

Liu, Jiafeng [1]

Wu, Rui [1]

Cheng, Dansong [1]

Tang, Xianglong [1]

Affiliations:

[1] Harbin Inst Technol, Harbin, Heilongjiang, Peoples R China

Source:

IEEE ACCESS | 2019 / Vol. 7

Funding:

U.S. National Science Foundation;

Keywords:

Reinforcement learning; information entropy; optimistic sampling; data efficiency;

DOI:

10.1109/ACCESS.2019.2913001

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

The large number of environment interactions required is one of the most important problems in reinforcement learning (RL). To deal with this problem, several data-efficient RL algorithms have been proposed and successfully applied in practice. Unlike previous research, which focuses on the optimal policy evaluation and policy improvement stages, we actively select informative samples by leveraging an entropy-based optimal sampling strategy that takes the initial sample set into consideration. During the initial sampling process, information entropy is used to describe the potential samples, and the agent selects the most informative samples using an optimization method. As a result, the initial sample set is more informative than one obtained with a random or fixed strategy, so a more accurate initial dynamics model and policy can be learned. The proposed optimal sampling method thus guides the agent to search in a more informative region. Experimental results on standard benchmark problems involving a pendulum, a cart pole, and a cart double pendulum show that our optimal sampling strategy performs better in terms of data efficiency.

Pages: 55763-55769

Page count: 7
