NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning

Citations: 0
Authors
Qin, Rong-Jun [1 ,2 ]
Zhang, Xingyuan [2 ]
Gao, Songyi [2 ]
Chen, Xiong-Hui [1 ,2 ]
Li, Zewen [2 ]
Zhang, Weinan [3 ]
Yu, Yang [1 ,2 ]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Polixir Technol, Nanjing, Peoples R China
[3] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
None
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Offline reinforcement learning (RL) aims to learn effective policies from historical data without further environment interaction. In applying offline RL, we noticed that previous offline RL benchmarks commonly involve significant reality gaps, which we identify as rich and overly exploratory datasets, degraded baselines, and missing policy validation. In many real-world settings, running an overly exploratory policy to collect diverse data is prohibited for safety reasons, so only a narrow data distribution is available. The resulting policy is regarded as effective only if it outperforms the working behavior policy, and the policy model can be deployed only after it has been well validated, not merely trained. In this paper, we present a near real-world offline RL benchmark, named NeoRL, that reflects these properties. NeoRL datasets are collected with a more conservative strategy. Moreover, NeoRL contains an offline training and offline validation pipeline before the online test, corresponding to real-world practice. We then evaluate recent state-of-the-art offline RL algorithms on NeoRL. The empirical results demonstrate that some offline RL algorithms are less competitive than behavior cloning and the deterministic behavior policy, implying that they may be less effective in real-world tasks than on previous benchmarks. We also find that current offline policy evaluation methods can hardly select the best policy. We hope this work sheds light on future research and on deploying RL in real-world systems.
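The abstract's three-stage pipeline (offline training on a conservatively collected dataset, offline validation to pick a policy, and only then an online test) can be sketched as follows. Everything below is illustrative, not NeoRL's actual API: the toy 1-D environment, the linear candidate policies, and the `offline_score` proxy (which stands in for a real off-policy evaluation method) are all assumptions for the sake of a runnable example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: 1-D states, scalar actions; reward = -(a - a*(s))^2,
# where a*(s) = 0.5*s is unknown to the learner.
def optimal_action(s):
    return 0.5 * s

def reward(s, a):
    return -(a - optimal_action(s)) ** 2

# 1) Offline dataset from a conservative behavior policy: actions stay
#    close to the working policy, so coverage is narrow (as in NeoRL).
states = rng.uniform(-1.0, 1.0, size=500)
behavior_actions = optimal_action(states) + rng.normal(0.0, 0.1, size=500)

# 2) Offline training: behavior cloning via least squares (action ~ w*s).
w_bc = np.sum(states * behavior_actions) / np.sum(states * states)

# Candidate policies to choose among: BC plus perturbed variants,
# standing in for policies produced by different offline RL runs.
candidates = {"bc": w_bc, "bc+0.3": w_bc + 0.3, "bc-0.3": w_bc - 0.3}

# 3) Offline validation: score each candidate on held-out logged states
#    WITHOUT touching the live system. Here the known reward function is
#    a stand-in for an off-policy evaluation estimator.
val_states = rng.uniform(-1.0, 1.0, size=200)

def offline_score(w):
    return reward(val_states, w * val_states).mean()

best_name = max(candidates, key=lambda k: offline_score(candidates[k]))

# 4) Online test: only the validated policy interacts with the environment.
test_states = rng.uniform(-1.0, 1.0, size=200)
online_return = reward(test_states, candidates[best_name] * test_states).mean()
print(best_name, round(online_return, 4))
```

The design point matches the abstract: the deployment decision is made by the offline validation step (stage 3), not by whichever policy finished training last, and online interaction is reserved for the final test alone.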
Pages: 13