A SURVEY OF ALGORITHMIC METHODS FOR PARTIALLY OBSERVED MARKOV DECISION PROCESSES

Cited: 296
Authors
Lovejoy, William S. [1 ]
Affiliation
[1] Stanford Univ, Grad Sch Business, Stanford, CA 94305 USA
Keywords
DOI
10.1007/BF02055574
CLC classification
C93 [Management]; O22 [Operations Research];
Discipline codes
070105; 12; 1201; 1202; 120202;
Abstract
A partially observed Markov decision process (POMDP) is a generalization of a Markov decision process that allows for incomplete information regarding the state of the system. The significant applied potential for such processes remains largely unrealized, due to a historical lack of tractable solution methodologies. This paper reviews some of the current algorithmic alternatives for solving discrete-time, finite POMDPs over both finite and infinite horizons. The major impediment to exact solution is that, even with a finite set of internal system states, the set of possible information states is uncountably infinite. Finite algorithms are theoretically available for exact solution of the finite horizon problem, but these are computationally intractable for even modest-sized problems. Several approximation methodologies are reviewed that have the potential to generate computationally feasible, high-precision solutions.
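To make concrete why the information-state set is uncountably infinite, here is a minimal sketch (not from the paper; all names and numbers are illustrative assumptions) of the standard Bayesian belief update for a discrete POMDP. The information state is the belief vector b, a probability distribution over the finite system states; since b ranges over the continuous probability simplex, even a two-state system admits uncountably many information states.

```python
def belief_update(b, a, o, T, O):
    """One-step Bayesian belief update for a discrete POMDP.

    b: current belief, b[s] = P(state = s)
    a: action taken; o: observation received
    T: transition model, T[a][s][s2] = P(s2 | s, a)
    O: observation model, O[a][s2][o] = P(o | s2, a)
    Returns the posterior belief b'(s2) ∝ O[a][s2][o] * sum_s T[a][s][s2] * b[s].
    """
    n = len(b)
    unnorm = [O[a][s2][o] * sum(T[a][s][s2] * b[s] for s in range(n))
              for s2 in range(n)]
    norm = sum(unnorm)  # probability of observing o; assumed > 0 here
    return [x / norm for x in unnorm]

# Tiny two-state, one-action example (numbers invented for illustration).
T = {0: [[0.9, 0.1], [0.2, 0.8]]}   # T[a][s][s2]
O = {0: [[0.7, 0.3], [0.4, 0.6]]}   # O[a][s2][o]
b0 = [0.5, 0.5]
b1 = belief_update(b0, a=0, o=1, T=T, O=O)
```

Iterating this update along any observation sequence traces a path through the simplex, which is why exact finite-horizon algorithms must represent the value function over a continuous domain (typically as a finite set of supporting hyperplanes) rather than over a finite state list.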
Pages: 47-65
Page count: 19