IDENTIFYING EFFECTIVE POLICIES IN APPROXIMATE DYNAMIC PROGRAMMING: BEYOND REGRESSION

Cited by: 0
Authors
Maxwell, Matthew S. [1]
Henderson, Shane G. [1]
Topaloglu, Huseyin [1]
Affiliations
[1] Cornell Univ, Dept Operat Res & Informat Engn, Ithaca, NY 14853 USA
Funding
National Science Foundation (USA)
Keywords
DOI
10.1109/WSC.2010.5679084
Chinese Library Classification number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
Dynamic programming formulations may be used to solve for optimal policies in Markov decision processes. Because of their computational complexity, dynamic programs must often be solved approximately. We consider the case of a tunable approximation architecture used in lieu of computing true value functions. The standard methodology advocates tuning the approximation architecture via sample path information and regression to get a good fit to the true value function. We provide an example showing that this approach may unnecessarily lead to poorly performing policies, and we suggest direct search methods to find better-performing value function approximations. We illustrate this concept with an application from ambulance redeployment.
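The contrast drawn in the abstract can be illustrated with a minimal sketch. Everything in it is a hypothetical toy constructed for this note, not the example from the paper: the three state values, the single basis function `FEATURE`, and the one-parameter architecture V_r(s) = r * phi(s) are assumptions. It also fits the regression against known true values rather than sample-path estimates, and a plain grid search stands in for the direct search methods the authors suggest.

```python
import numpy as np

# Hypothetical toy problem (NOT the paper's example). From a start state the
# decision maker moves to successor state 1 or 2. State 3 also belongs to the
# process and enters the regression, but cannot be reached by this decision.
TRUE_VALUE = np.array([2.0, 1.0, 10.0])  # assumed true values of states 1, 2, 3
FEATURE = np.array([2.0, 1.0, -1.0])     # assumed single basis function phi(s)


def approx_value(r):
    """One-parameter approximation architecture: V_r(s) = r * phi(s)."""
    return r * FEATURE


def greedy_policy_value(r):
    """True value earned at the start state when acting greedily w.r.t. V_r."""
    v = approx_value(r)
    choice = 0 if v[0] > v[1] else 1  # index 0 -> state 1, index 1 -> state 2
    return float(TRUE_VALUE[choice])


def regression_fit():
    """Least-squares r minimizing sum_s (r * phi(s) - V(s))^2."""
    return float(FEATURE @ TRUE_VALUE / (FEATURE @ FEATURE))


def direct_search(candidates):
    """Pick the r whose greedy policy performs best (grid search stand-in)."""
    return max(candidates, key=greedy_policy_value)


r_reg = regression_fit()
r_dir = float(direct_search(np.linspace(-2.0, 2.0, 41)))

print("regression r:", r_reg, "policy value:", greedy_policy_value(r_reg))
print("direct-search r:", r_dir, "policy value:", greedy_policy_value(r_dir))
```

In this toy, the squared error is dominated by state 3, so the regression picks a negative `r` that ranks states 1 and 2 in the wrong order. Direct search scores each `r` by the performance of the policy it induces, so it is not penalized for fitting the value of state 3 poorly.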
Pages: 1079-1087
Number of pages: 9
Related papers
50 in total
  • [1] Approximate Dynamic Programming Using Support Vector Regression
    Bethke, Brett
    How, Jonathan P.
    Ozdaglar, Asuman
    [J]. 47TH IEEE CONFERENCE ON DECISION AND CONTROL, 2008 (CDC 2008), 2008, : 3811 - 3816
  • [2] Approximate Dynamic Programming for Military Medical Evacuation Dispatching Policies
    Jenkins, Phillip R.
    Robbins, Matthew J.
    Lunday, Brian J.
    [J]. INFORMS JOURNAL ON COMPUTING, 2021, 33 (01) : 2 - 26
  • [3] Identifying cost-effective dynamic policies to control epidemics
    Yaesoubi, Reza
    Cohen, Ted
    [J]. STATISTICS IN MEDICINE, 2016, 35 (28) : 5189 - 5209
  • [4] Approximate Dynamic Programming Using Bellman Residual Elimination and Gaussian Process Regression
    Bethke, Brett
    How, Jonathan P.
    [J]. 2009 AMERICAN CONTROL CONFERENCE, VOLS 1-9, 2009, : 745 - +
  • [5] An approximate dynamic programming approach for comparing firing policies in a networked air defense environment
    Summers, Daniel S.
    Robbins, Matthew J.
    Lunday, Brian J.
    [J]. COMPUTERS & OPERATIONS RESEARCH, 2020, 117
  • [6] Perspectives of approximate dynamic programming
    Powell, Warren B.
    [J]. ANNALS OF OPERATIONS RESEARCH, 2016, 241 (1-2) : 319 - 356
  • [7] A Survey of Approximate Dynamic Programming
    Wang Lin
    Peng Hui
    Zhu Hua-yong
    Shen Lin-cheng
    [J]. 2009 INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS, VOL 2, PROCEEDINGS, 2009, : 396 - 399
  • [8] A LINEAR PROGRAMMING METHODOLOGY FOR APPROXIMATE DYNAMIC PROGRAMMING
    Diaz, Henry
    Sala, Antonio
    Armesto, Leopoldo
    [J]. INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND COMPUTER SCIENCE, 2020, 30 (02) : 363 - 375
  • [9] The linear programming approach to approximate dynamic programming
    De Farias, DP
    Van Roy, B
    [J]. OPERATIONS RESEARCH, 2003, 51 (06) : 850 - 865
  • [10] Approximate dynamic programming via linear programming
    de Farias, DP
    Van Roy, B
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 14, VOLS 1 AND 2, 2002, 14 : 689 - 695