A goal-oriented reinforcement learning for optimal drug dosage control

Cited by: 0
Authors
Zhang, Qian [1]
Li, Tianhao [1]
Li, Dengfeng [1]
Lu, Wei [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Management & Econ, Chengdu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Goal-oriented; Reinforcement learning; Hierarchical decision; Multi-agent; Drug dosage control; SEPTIC SHOCK; SEPSIS; MORTALITY; LEVEL; CARE;
DOI
10.1007/s10479-024-06029-x
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research]
Discipline classification codes
070105; 12; 1201; 1202; 120202
Abstract
The dosage control of therapeutic drugs is a major concern for clinicians. Whether a clinician's dosing decision is correct and efficient can determine a patient's survival. In intensive care units (ICUs), medication decision-making is a dynamic and continuous process that is difficult to handle with traditional intelligent technologies. While reinforcement learning (RL) has an advantage in sequential decision-making, it faces challenges in multi-level problems because of delayed rewards and complex states. Hierarchical reinforcement learning (HRL) is a layered algorithm based on RL. HRL has proved effective for problems with delayed, sparse rewards, and it reduces learning difficulty by dividing a long-term goal into stages. Inspired by this, we propose a goal-oriented reinforcement learning (GORL) approach to optimize drug dosage control for sepsis patients. Specifically, GORL employs two agents that make dosage decisions cooperatively by simulating the behavior of clinicians. GORL decomposes a long-term goal into several short-term goals to reduce the exploration space. For the long-term goal, the goal-oriented concept is introduced to address the sparse-reward problem. A goal-oriented hierarchical structure helps the agents interact and cooperate to achieve the short-term goals. In addition, we design a hindsight intrinsic reward to balance the long-term and short-term goals, and are thus able to learn an optimal policy for drug dosage control. We conduct our experiments on MIMIC-IV, one of the largest medical datasets. The experimental results show that our model outperforms the baseline algorithms and can learn a more robust treatment policy than that of clinicians, reducing patient mortality by 10.23%.
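The abstract does not spell out how the hindsight intrinsic reward is computed. As a rough illustration of the general idea only, the sketch below shows generic hindsight goal relabeling for the short-term (low-level) agent in a goal-conditioned hierarchy. It is not the authors' GORL implementation: the names `Transition`, `intrinsic_reward`, and `hindsight_relabel`, the tolerance `tol`, and the 0 / -1 reward scheme are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple

State = Tuple[float, ...]  # hypothetical patient-state vector


@dataclass
class Transition:
    state: State
    action: int          # hypothetical discrete dosage level
    next_state: State
    goal: State          # short-term goal handed down by the high-level agent
    reward: float        # intrinsic reward for the low-level agent


def intrinsic_reward(next_state: State, goal: State, tol: float = 0.1) -> float:
    """Assumed sparse scheme: 0 if every dimension is within tol of the goal, else -1."""
    reached = all(abs(s - g) <= tol for s, g in zip(next_state, goal))
    return 0.0 if reached else -1.0


def hindsight_relabel(segment: List[Transition]) -> List[Transition]:
    """Replace the goal of a segment with the state it actually reached.

    A segment that missed its original goal still yields a useful training
    signal, because its final transition now counts as reaching the goal.
    """
    achieved = segment[-1].next_state
    return [
        Transition(t.state, t.action, t.next_state, achieved,
                   intrinsic_reward(t.next_state, achieved))
        for t in segment
    ]


# A two-step segment that aimed for (1.0,) but only reached (0.6,).
goal = (1.0,)
segment = [
    Transition((0.0,), 1, (0.3,), goal, intrinsic_reward((0.3,), goal)),
    Transition((0.3,), 1, (0.6,), goal, intrinsic_reward((0.6,), goal)),
]
relabeled = hindsight_relabel(segment)
```

In this toy segment every original transition is penalized, while after relabeling the final transition counts as a success. That is the property that makes hindsight-style rewards attractive when the long-term outcome is sparse and delayed.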
Pages: 1403-1423
Number of pages: 21