Sensitivity-based nested partitions for solving finite-horizon Markov decision processes

Cited by: 1
Authors
Chen, Weiwei [1]
Affiliations
[1] Rutgers State Univ, Dept Supply Chain Management, 1 Washington Pk, Newark, NJ 07102 USA
Keywords
Approximate dynamic programming; Markov decision processes; Nested partitions; Sensitivity-based approach
DOI
10.1016/j.orl.2017.07.006
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research]
Discipline classification codes
070105; 12; 1201; 1202; 120202
Abstract
In this paper, we propose a heuristic for solving finite-horizon Markov decision processes. The heuristic uses the nested partitions (NP) framework to guide an iterative search for the optimal policy. NP focuses the search on certain promising subregions, flexibly determined by the sampling weight of each action branch. Within each subregion, an effective local policy optimization is developed using a sensitivity-based approach, which optimizes the sampling weights based on estimated gradient information. Numerical results show the effectiveness of the proposed heuristic. (C) 2017 Elsevier B.V. All rights reserved.
Pages: 481-487
Number of pages: 7