A goal-oriented reinforcement learning for optimal drug dosage control

Times cited: 0
Authors
Zhang, Qian [1]
Li, Tianhao [1]
Li, Dengfeng [1]
Lu, Wei [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Management & Econ, Chengdu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Goal-oriented; Reinforcement learning; Hierarchical decision; Multi-agent; Drug dosage control; SEPTIC SHOCK; SEPSIS; MORTALITY; LEVEL; CARE;
DOI
10.1007/s10479-024-06029-x
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research];
Discipline classification code
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
The dosage control of therapeutic drugs is a concern for clinicians. Whether a clinician's dosing decision is correct and efficient can determine a patient's survival. In intensive care units (ICU), medication decision-making is a dynamic and continuous process, which is difficult to handle with traditional intelligent technologies. While reinforcement learning (RL) has an advantage in handling sequential decision making, it faces challenges in multi-level problems because of delayed rewards and complex states. Hierarchical reinforcement learning (HRL) is a layered algorithm based on RL. HRL has proved effective on problems with delayed, sparse rewards, and it reduces learning difficulty by dividing a long-term goal into stages. Inspired by this, we propose a goal-oriented reinforcement learning (GORL) approach to optimize drug dosage control for sepsis patients. Specifically, GORL employs two agents that make dosage decisions cooperatively by simulating the behavior of clinicians. GORL decomposes a long-term goal into several short-term goals to reduce the exploration space. For the long-term goal, the concept of goal orientation is introduced to address the sparse reward. A goal-oriented hierarchical structure helps the agents interact and cooperate to achieve the short-term goals. In addition, we design a hindsight intrinsic reward to balance the long-term and short-term goals, and are thus able to learn an optimal policy for drug dosage control. We conduct our experiments on MIMIC-IV, one of the largest medical datasets. The experimental results show that our model outperforms the baseline algorithms and can learn a more robust treatment policy than clinicians, reducing patient mortality by 10.23%.
Pages: 1403-1423
Page count: 21