Simulation-Based Optimization Algorithms for Finite-Horizon Markov Decision Processes

Cited by: 8
Authors
Bhatnagar, Shalabh [1]
Abdulla, Mohammed Shahid [2]
Affiliations
[1] Indian Inst Sci, Dept Comp Sci & Automat, Bangalore 560012, Karnataka, India
[2] Gen Motors, India Sci Lab, Bangalore, Karnataka, India
Keywords
Finite-horizon Markov decision processes; simulation-based algorithms; two-timescale stochastic approximation; function approximation; actor-critic algorithms; normalized Hadamard matrices
DOI
10.1177/0037549708098120
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
We develop four simulation-based algorithms for finite-horizon Markov decision processes. Two of these algorithms are developed for finite state and compact action spaces while the other two are for finite state and finite action spaces. Of the former two, one algorithm uses a linear parameterization for the policy, resulting in reduced memory complexity. Convergence analysis is briefly sketched and illustrative numerical experiments with the four algorithms are shown for a problem of flow control in communication networks.
Pages: 577-600
Page count: 24