Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data

Cited by: 0

Authors:

Nie, Allen [1]

Flet-Berliac, Yannis [1]

Jordan, Deon R. [1]

Steenbergen, William [1]

Brunskill, Emma [1]

Affiliations:

[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022) | 2022

Keywords:

ERROR RATE;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Offline reinforcement learning (RL) can be used to improve future performance by leveraging historical data. There exist many different algorithms for offline RL, and it is well recognized that these algorithms, and their hyperparameter settings, can lead to decision policies with substantially differing performance. This prompts the need for pipelines that allow practitioners to systematically perform algorithm-hyperparameter selection for their setting. Critically, in most real-world settings, this pipeline must only involve the use of historical data. Inspired by statistical model selection methods for supervised learning, we introduce a task- and method-agnostic pipeline for automatically training, comparing, selecting, and deploying the best policy when the provided dataset is limited in size. In particular, our work highlights the importance of performing multiple data splits to produce more reliable algorithm-hyperparameter selection. While this is a common approach in supervised learning, to our knowledge, this has not been discussed in detail in the offline RL setting. We show it can have substantial impacts when the dataset is small. Compared to alternate approaches, our proposed pipeline outputs higher-performing deployed policies from a broad range of offline policy learning algorithms and across various simulation domains in healthcare, education, and robotics. This work contributes toward the development of a general-purpose meta-algorithm for automatic algorithm-hyperparameter selection for offline RL.
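
The abstract's central idea, selecting an algorithm-hyperparameter pair by averaging over multiple train/validation splits of the historical data, can be illustrated with a small Python sketch. This is not the authors' implementation: the function `select_by_repeated_splits`, its parameters (`n_splits`, `train_frac`, `seed`), the toy bandit-style data, and the crude action-matching estimator are all hypothetical names and simplifications chosen for illustration. A real pipeline would plug in actual offline RL learners and an off-policy evaluation method.

```python
import random
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[int, float]  # (logged action, reward): toy stand-in for a trajectory
Policy = int                # toy policy: always play this action
TrainFn = Callable[[Sequence[Sample]], Policy]
EvalFn = Callable[[Policy, Sequence[Sample]], float]


def select_by_repeated_splits(
    data: Sequence[Sample],
    candidates: Dict[str, TrainFn],
    evaluate: EvalFn,
    n_splits: int = 5,
    train_frac: float = 0.5,
    seed: int = 0,
) -> Tuple[str, Dict[str, float]]:
    """Score each candidate on several random splits and pick the best average."""
    if len(data) < 2:
        raise ValueError("need at least two samples to form a split")
    rng = random.Random(seed)
    # Keep at least one sample on each side of the split.
    n_train = min(len(data) - 1, max(1, round(train_frac * len(data))))
    scores: Dict[str, List[float]] = {name: [] for name in candidates}
    for _ in range(n_splits):
        shuffled = list(data)
        rng.shuffle(shuffled)
        train, valid = shuffled[:n_train], shuffled[n_train:]
        for name, train_fn in candidates.items():
            policy = train_fn(train)                         # learn on the training part
            scores[name].append(evaluate(policy, valid))     # estimate value on held-out part
    avg = {name: mean(vals) for name, vals in scores.items()}
    best = max(avg, key=avg.get)
    return best, avg


# --- Hypothetical toy candidates and estimator (assumptions, for illustration) ---
def train_greedy(train: Sequence[Sample]) -> Policy:
    """Pick the action with the highest mean logged reward."""
    by_action: Dict[int, List[float]] = {}
    for action, reward in train:
        by_action.setdefault(action, []).append(reward)
    return max(by_action, key=lambda a: mean(by_action[a]))


def train_always_zero(train: Sequence[Sample]) -> Policy:
    """Baseline that ignores the data and always plays action 0."""
    return 0


def matching_estimate(policy: Policy, valid: Sequence[Sample]) -> float:
    """Mean reward over held-out samples whose logged action matches the policy."""
    matched = [reward for action, reward in valid if action == policy]
    return mean(matched) if matched else 0.0


if __name__ == "__main__":
    toy_data = [(i % 2, float(i % 2)) for i in range(20)]  # action 1 pays 1.0, action 0 pays 0.0
    candidates = {"greedy": train_greedy, "always_zero": train_always_zero}
    best, avg = select_by_repeated_splits(toy_data, candidates, matching_estimate)
    print(best, avg)
```

In this sketch, the selected candidate would then typically be retrained on the full dataset before deployment, for example `candidates[best](toy_data)`; that final step is an assumption about how "deploying the best policy" might be realized, not a detail stated in the abstract.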


Pages: 14

