Deep Offline Reinforcement Learning for Real-world Treatment Optimization Applications

Cited by: 4
Authors
Nambiar, Mila [1]
Ghosh, Supriyo [1,2]
Ong, Priscilla [1]
Chan, Yu En [1]
Bee, Yong Mong [3]
Krishnaswamy, Pavitra [1]
Affiliations
[1] A*STAR, Institute for Infocomm Research (I2R), Singapore, Singapore
[2] Microsoft, Bengaluru, Karnataka, India
[3] Singapore General Hospital, Department of Endocrinology, Singapore, Singapore
Keywords
Offline reinforcement learning; treatment optimization; sepsis treatment; type 2 diabetes treatment; sampling; safety constraints
DOI
10.1145/3580305.3599800
CLC Number
TP [Automation and Computer Technology]
Discipline Code
0812
Abstract
There is increasing interest in data-driven approaches for recommending optimal treatment strategies in many chronic disease management and critical care applications. Reinforcement learning (RL) methods are well-suited to this sequential decision-making problem, but must be trained and evaluated exclusively on retrospective medical record datasets, as direct online exploration is unsafe and infeasible. Despite this requirement, the vast majority of treatment optimization studies use off-policy RL methods (e.g., Double Deep Q-Networks (DDQN) and its variants) that are known to perform poorly in purely offline settings. Recent advances in offline RL, such as Conservative Q-Learning (CQL), offer a suitable alternative. However, challenges remain in adapting these approaches to real-world applications, where suboptimal examples dominate the retrospective dataset and strict safety constraints need to be satisfied. In this work, we introduce a practical and theoretically grounded transition sampling approach to address action imbalance during offline RL training. We perform extensive experiments on two real-world tasks for diabetes and sepsis treatment optimization to compare the performance of the proposed approach against prominent off-policy and offline RL baselines (DDQN and CQL). Across a range of principled and clinically relevant metrics, we show that our proposed approach enables substantial improvements in expected health outcomes and in consistency with relevant practice and safety guidelines.
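The core idea of the transition sampling described in the abstract can be sketched as inverse-frequency weighting over the logged actions, so that transitions containing rare actions are drawn more often during offline training. The sketch below is a minimal illustration only, under assumed details: the function name `rebalanced_sample` and the uniform-mixing parameter `alpha` are hypothetical, not the authors' exact method.

```python
import random
from collections import Counter

def rebalanced_sample(transitions, batch_size, alpha=0.5, seed=0):
    """Sample a training batch in which rare actions are up-weighted.

    transitions: list of (state, action, reward, next_state) tuples
                 from a retrospective dataset.
    alpha: mixes inverse-action-frequency weights (alpha=1.0) with
           plain uniform sampling (alpha=0.0).
    """
    counts = Counter(a for _, a, _, _ in transitions)
    n = len(transitions)
    # Each transition's weight blends 1/count(action) with 1/n (uniform).
    weights = [
        alpha * (1.0 / counts[a]) + (1.0 - alpha) * (1.0 / n)
        for _, a, _, _ in transitions
    ]
    rng = random.Random(seed)
    return rng.choices(transitions, weights=weights, k=batch_size)

# Toy dataset: action 0 dominates, as suboptimal actions often do in
# retrospective clinical records.
data = [("s", 0, 0.0, "s2")] * 90 + [("s", 1, 1.0, "s2")] * 10
batch = rebalanced_sample(data, batch_size=100, alpha=1.0)
```

With pure inverse-frequency weights (`alpha=1.0`), each action class receives equal total sampling mass, so both actions appear in roughly equal proportions in the batch despite the 9:1 imbalance in the dataset.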
Pages: 4673-4684
Page count: 12