Learning Parameterized Prescription Policies and Disease Progression Dynamics using Markov Decision Processes

Cited by: 0
Authors
Zhu, Henghui [1 ]
Xu, Tingting [1 ]
Paschalidis, Ioannis Ch [2 ,3 ]
Affiliations
[1] Boston Univ, Ctr Informat & Syst Engn, Boston, MA 02215 USA
[2] Boston Univ, Dept Elect & Comp Engn, Div Syst Engn, 8 St Marys St, Boston, MA 02215 USA
[3] Boston Univ, Dept Biomed Engn, 8 St Marys St, Boston, MA 02215 USA
Keywords: (none listed)
DOI: not available
Chinese Library Classification: TP [automation and computer technology]
Discipline code: 0812
Abstract
We develop an algorithm for learning physicians' prescription policies and disease progression dynamics from Electronic Health Record (EHR) data. The prescription protocol used by physicians is viewed as a control policy, a function of an underlying disease state, within a Markov Decision Process (MDP) framework. We assume that the transition probabilities and the policy of the MDP are parameterized by known features, of which only a small subset is informative. Two ℓ1-regularized maximum likelihood estimation problems are formulated to learn the transition probabilities and the policy, respectively. A bound is established on the difference between the average reward of the estimated policy under the estimated transition dynamics and that of the original (unknown) policy under the true transition dynamics. Our result suggests that, using only a relatively small number of training samples, the estimate can achieve low regret. We validate our theoretical results on a test MDP motivated by a disease-treatment identification application.
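The policy-estimation side of the approach can be illustrated with a minimal sketch. Assuming (illustratively, not as the paper's exact formulation) a softmax policy π(a | s) ∝ exp(θ · φ(s, a)) with feature vectors φ(s, a), the ℓ1-regularized maximum likelihood problem can be solved by proximal gradient descent on the negative log-likelihood of observed (state, action) pairs; all names, dimensions, and hyperparameters below are made up for the example:

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the l1 norm (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def fit_l1_policy(features, states, actions, lam=0.05, step=0.1, iters=300):
    """l1-regularized MLE of a softmax policy pi(a|s) ~ exp(theta . phi(s, a))
    from observed (state, action) pairs, via proximal gradient descent.

    features: (n_states, n_actions, d) array holding phi(s, a)
    """
    n_states, n_actions, d = features.shape
    theta = np.zeros(d)
    n = len(states)
    for _ in range(iters):
        grad = np.zeros(d)
        for s, a in zip(states, actions):
            logits = features[s] @ theta          # shape (n_actions,)
            p = np.exp(logits - logits.max())
            p /= p.sum()                          # pi_theta(. | s)
            # gradient of the per-sample negative log-likelihood
            grad += p @ features[s] - features[s, a]
        theta = soft_threshold(theta - step * grad / n, step * lam)
    return theta

# Synthetic check: only the first two of six features are informative,
# mirroring the sparsity assumption in the abstract.
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 3, 6))             # 5 states, 3 actions
true_theta = np.array([2.0, -2.0, 0.0, 0.0, 0.0, 0.0])
states = rng.integers(0, 5, size=500)
actions = []
for s in states:
    logits = features[s] @ true_theta
    p = np.exp(logits - logits.max())
    actions.append(rng.choice(3, p=p / p.sum()))
theta_hat = fit_l1_policy(features, states, actions)
```

The transition-probability estimate in the paper is the analogous ℓ1-regularized MLE over parameterized transition dynamics; the ℓ1 penalty is what lets a small number of samples suffice when only a few features matter.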
Pages: 3438-3443 (6 pages)