Simulation-based optimization of Markov reward processes

Authors
Marbach, P [1]
Tsitsiklis, JN [1]
Affiliation
[1] MIT, Informat & Decis Syst Lab, Cambridge, MA 02139 USA
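DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract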
We propose a simulation-based algorithm for optimizing the average reward in a Markov Reward Process that depends on a set of parameters. As a special case, the method applies to Markov Decision Processes where optimization takes place within a parametrized set of policies. The algorithm involves the simulation of a single sample path, and can be implemented on-line. A convergence result (with probability 1) is provided.
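For orientation, the optimization problem summarized in the abstract can be written in the following generic form. The notation ($\theta$, $S$, $\pi_\theta$, $g_\theta$, $\gamma_k$, $\widehat{\nabla \lambda}$) is an assumption introduced here for illustration and does not come from this record. It sketches a stochastic gradient scheme of the kind the abstract describes and is not a statement of the authors' exact update rule.

```latex
% Average reward of a Markov reward process with parameter vector \theta
% (assumed: finite state space S, stationary distribution \pi_\theta,
%  one-step reward g_\theta(i) in state i)
\lambda(\theta) = \sum_{i \in S} \pi_\theta(i)\, g_\theta(i)

% Generic on-line gradient ascent along a single simulated sample path
% (assumed: step sizes \gamma_k > 0, gradient estimate \widehat{\nabla \lambda}
%  computed from the simulated trajectory)
\theta_{k+1} = \theta_k + \gamma_k\, \widehat{\nabla \lambda}(\theta_k)
```

In the Markov Decision Process special case mentioned in the abstract, $\theta$ would index a parametrized family of policies, so that both $\pi_\theta$ and $g_\theta$ depend on the policy in use.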
Pages: 2698 - 2703
Page count: 6
Related papers
50 in total
  • [1] Simulation-based optimization of Markov reward processes
    Marbach, Peter
    Tsitsiklis, John N.
    [J]. Proceedings of the IEEE Conference on Decision and Control, 1998, 3 : 2698 - 2703
  • [2] Simulation-based optimization of Markov reward processes
    Marbach, P
    Tsitsiklis, JN
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2001, 46 (02) : 191 - 209
  • [3] Simulation-based optimization of singularly perturbed Markov reward processes with states aggregation
    Zhang, DL
    Xi, HS
    Yin, BQ
    [J]. ADVANCES IN INTELLIGENT COMPUTING, PT 2, PROCEEDINGS, 2005, 3645 : 129 - 138
  • [4] PAC bounds for simulation-based optimization of Markov decision processes
    Jain, Rahul
    Varaiya, Pravin P.
    [J]. PROCEEDINGS OF THE 46TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-14, 2007, : 6389 - +
  • [5] SIMULATION-BASED OPTIMIZATION OF MARKOV CONTROLLED PROCESSES WITH UNKNOWN PARAMETERS
    Campos-Nanez, Enrique
    [J]. 23RD EUROPEAN CONFERENCE ON MODELLING AND SIMULATION (ECMS 2009), 2009, : 537 - 543
  • [6] Simulation-Based Optimization Algorithms for Finite-Horizon Markov Decision Processes
    Bhatnagar, Shalabh
    Abdulla, Mohammed Shahid
    [J]. SIMULATION-TRANSACTIONS OF THE SOCIETY FOR MODELING AND SIMULATION INTERNATIONAL, 2008, 84 (12) : 577 - 600
  • [7] Simulation-based optimization of Markov decision processes: An empirical process theory approach
    Jain, Rahul
    Varaiya, Pravin
    [J]. AUTOMATICA, 2010, 46 (08) : 1297 - 1304
  • [8] Adaptive optimization of Markov reward processes
    Campos-Nanez, Enrique
    Patek, Stephen D.
    [J]. 2005 44TH IEEE CONFERENCE ON DECISION AND CONTROL & EUROPEAN CONTROL CONFERENCE, VOLS 1-8, 2005, : 8034 - 8041
  • [9] Distributed optimization of Markov reward processes
    Campos-Nanez, Enrique
    [J]. PROCEEDINGS OF THE 46TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-14, 2007, : 3921 - 3926
  • [10] A SURVEY OF SOME SIMULATION-BASED ALGORITHMS FOR MARKOV DECISION PROCESSES
    Chang, Hyeong Soo
    Fu, Michael C.
    Hu, Jiaqiao
    Marcus, Steven I.
    [J]. COMMUNICATIONS IN INFORMATION AND SYSTEMS, 2007, 7 (01) : 59 - 92