Deep Spatial Q-Learning for Infectious Disease Control

Cited: 0
Authors
Liu, Zhishuai [1 ]
Clifton, Jesse [2 ]
Laber, Eric B. [3 ]
Drake, John [4 ]
Fang, Ethan X. [5 ]
Affiliations
[1] Duke Univ, Dept Stat Sci, Durham, NC USA
[2] NC State Univ, Dept Stat, Raleigh, NC USA
[3] Duke Univ, Dept Stat Sci, Dept Biostat & Bioinformat, Durham, NC 27708 USA
[4] Univ Georgia, Sch Ecol, Athens, GA USA
[5] Duke Univ, Dept Biostat & Bioinformat, Durham, NC USA
Funding
U.S. National Science Foundation;
Keywords
Infectious diseases; Reinforcement learning; Graph neural networks; DYNAMIC TREATMENT REGIMES; EBOLA-VIRUS DISEASE; CAUSAL INFERENCE; REGRESSION; MODELS; TIME; EPIDEMIC;
DOI
10.1007/s13253-023-00551-4
Chinese Library Classification
Q [Biological Sciences];
Discipline Classification Codes
07 ; 0710 ; 09 ;
Abstract
Infectious diseases are a cause of humanitarian and economic crises across the world. In developing regions, a severe epidemic can result in the collapse of healthcare infrastructure or even the failure of an affected state. The most recent 2013-2015 outbreak of Ebola virus disease in West Africa is an example of such an epidemic. The economic, infrastructural, and human costs of this outbreak provide strong motivation for the examination of adaptive treatment strategies that allocate resources in response to and anticipation of the evolution of an epidemic. We formalize adaptive management of an emerging infectious disease spreading across a set of locations as a treatment regime that maps up-to-date information on the epidemic to a subset of locations identified as high-priority for treatment. An optimal treatment regime in this context is defined as maximizing the expectation of a pre-specified cumulative utility measure, e.g., the number of disease-free individuals or the estimated reduction in morbidity or mortality relative to a baseline intervention strategy. Because the disease dynamics are not known at the beginning of an outbreak, an optimal treatment regime must be estimated online, i.e., as data accumulate; thus, an effective estimation algorithm must balance choosing interventions that lead to information gain and thereby model improvement with interventions that appear to be optimal under the current estimated model. We develop a novel model-free algorithm for the online management of an infectious disease spreading over a finite set of locations and an indefinite or infinite time horizon. The proposed algorithm balances exploration and exploitation using a semi-parametric variant of Thompson sampling. We also introduce a graph neural network-based estimator in order to improve the performance of this class of algorithms. 
Simulations, including those mimicking the spread of the 2013-2015 Ebola outbreak, suggest that an adaptive treatment strategy has the potential to significantly reduce mortality relative to ad hoc management strategies. Supplementary materials accompanying this paper appear online.
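The allocation step the abstract describes — sampling from a posterior over the estimated disease model and treating the locations that look best under that draw — can be sketched as follows. This is an illustrative simplification only: it uses a conjugate Bayesian linear model of per-location treatment benefit, whereas the paper's actual method is a semi-parametric Thompson sampling variant with a graph neural network Q-function. All function and variable names here are hypothetical.

```python
import numpy as np

def thompson_allocate(X_hist, y_hist, X_now, k, sigma2=1.0, tau2=1.0, rng=None):
    """Choose k locations to treat by drawing one posterior sample of the
    benefit model and treating the locations with the highest sampled benefit.

    X_hist : (n, d) features of previously treated locations
    y_hist : (n,)   observed benefits (e.g., reduction in new infections)
    X_now  : (m, d) current features of the m candidate locations
    """
    rng = rng if rng is not None else np.random.default_rng()
    d = X_hist.shape[1]
    # Conjugate Gaussian posterior for the weights: prior N(0, tau2*I), noise var sigma2
    precision = X_hist.T @ X_hist / sigma2 + np.eye(d) / tau2
    cov = np.linalg.inv(precision)
    mean = cov @ X_hist.T @ y_hist / sigma2
    w = rng.multivariate_normal(mean, cov)  # one Thompson draw from the posterior
    scores = X_now @ w                      # sampled benefit for each candidate location
    return np.argsort(scores)[::-1][:k]     # indices of the k highest-benefit locations
```

Because the allocation is driven by a random posterior draw rather than the posterior mean, locations with uncertain but potentially large benefit are occasionally treated, which is how Thompson sampling balances exploration against exploitation as outbreak data accumulate.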
Pages: 749 - 773
Page count: 25
Related Papers
50 records in total
  • [1] Deep Spatial Q-Learning for Infectious Disease Control
    Liu, Zhishuai
    Clifton, Jesse
    Laber, Eric B.
    Drake, John
    Fang, Ethan X.
    [J]. Journal of Agricultural, Biological and Environmental Statistics, 2023, 28 : 749 - 773
  • [2] Deep Reinforcement Learning: From Q-Learning to Deep Q-Learning
    Tan, Fuxiao
    Yan, Pengfei
    Guan, Xinping
    [J]. NEURAL INFORMATION PROCESSING (ICONIP 2017), PT IV, 2017, 10637 : 475 - 483
  • [3] Deep Q-learning: A robust control approach
    Varga, Balazs
    Kulcsar, Balazs
    Chehreghani, Morteza Haghir
    [J]. INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2023, 33 (01) : 526 - 544
  • [4] Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning
    Ohnishi, Shota
    Uchibe, Eiji
    Yamaguchi, Yotaro
    Nakanishi, Kosuke
    Yasui, Yuji
    Ishii, Shin
    [J]. FRONTIERS IN NEUROROBOTICS, 2019, 13
  • [5] Faster Deep Q-learning using Neural Episodic Control
    Nishio, Daichi
    Yamane, Satoshi
    [J]. 2018 IEEE 42ND ANNUAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE (COMPSAC), VOL 1, 2018, : 486 - 491
  • [6] Double Deep Q-Learning Based Irrigation and Chemigation Control
    Song, Jianfeng
    Porter, Dana
    Hu, Jiang
    Marek, Thomas
    [J]. PROCEEDINGS OF THE TWENTY THIRD INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN (ISQED 2022), 2022, : 414 - 419
  • [7] Adaptive Traffic Signal Control with Deep Recurrent Q-learning
    Zeng, Jinghong
    Hu, Jianming
    Zhang, Yi
    [J]. 2018 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2018, : 1215 - 1220
  • [8] Deep Reinforcement Learning with Double Q-Learning
    van Hasselt, Hado
    Guez, Arthur
    Silver, David
    [J]. THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 2094 - 2100
  • [9] Comparison of Deep Q-Learning, Q-Learning and SARSA Reinforced Learning for Robot Local Navigation
    Anas, Hafiq
    Ong, Wee Hong
    Malik, Owais Ahmed
    [J]. ROBOT INTELLIGENCE TECHNOLOGY AND APPLICATIONS 6, 2022, 429 : 443 - 454
  • [10] Active deep Q-learning with demonstration
    Chen, Si-An
    Tangkaratt, Voot
    Lin, Hsuan-Tien
    Sugiyama, Masashi
    [J]. MACHINE LEARNING, 2020, 109 (9-10) : 1699 - 1725