Simulation-based optimization of Markov reward processes

Cited by: 0
Authors
Marbach, P [1]
Tsitsiklis, JN [1]
Affiliations
[1] MIT, Informat & Decis Syst Lab, Cambridge, MA 02139 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We propose a simulation-based algorithm for optimizing the average reward in a Markov Reward Process that depends on a set of parameters. As a special case, the method applies to Markov Decision Processes where optimization takes place within a parametrized set of policies. The algorithm involves the simulation of a single sample path, and can be implemented on-line. A convergence result (with probability 1) is provided.
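The idea of improving the average reward from a single simulated path can be illustrated with a small sketch. This is not the authors' algorithm: it is a generic score-function (likelihood-ratio) gradient update on a hypothetical two-state chain, with the trace reset at a chosen recurrent state. The chain, the reward values, the step sizes, and every name (`run_single_path`, `offsets`, `recurrent_state`, and so on) are assumptions made for illustration only.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def run_single_path(theta=0.0, steps=5000, step_size=0.01, eta=1.0, seed=0):
    """Tune a scalar parameter along one simulated path (illustrative only).

    Hypothetical model: states {0, 1}; from state i the chain moves to
    state 1 with probability sigmoid(theta + offsets[i]); the reward is 1
    in state 1 and 0 in state 0, so larger theta gives a larger average reward.
    """
    rng = random.Random(seed)
    rewards = (0.0, 1.0)      # assumed one-step rewards
    offsets = (0.0, -1.0)     # assumed state-dependent offsets
    recurrent_state = 0       # the trace is reset on every visit to this state

    state = recurrent_state
    avg_reward = 0.0          # running estimate of the average reward
    trace = 0.0               # accumulated score since the last reset

    for _ in range(steps):
        # Transition probability and reward gap under the current parameter.
        p_up = sigmoid(theta + offsets[state])
        delta = rewards[state] - avg_reward

        # Gradient-style parameter step and average-reward tracking.
        theta += step_size * delta * trace
        avg_reward += eta * step_size * delta

        # Simulate one transition and compute d/dtheta log p(state -> nxt).
        nxt = 1 if rng.random() < p_up else 0
        score = (1.0 - p_up) if nxt == 1 else -p_up

        # Reset the trace at the recurrent state, otherwise accumulate.
        trace = 0.0 if nxt == recurrent_state else trace + score
        state = nxt

    return theta, avg_reward


if __name__ == "__main__":
    final_theta, final_avg = run_single_path()
    print(final_theta, final_avg)
```

In this toy model the trace is never negative while the chain sits in the rewarding state, so every update pushes `theta` upward and the estimated average reward stays within [0, 1]. A constant step size is used for simplicity; a convergence guarantee of the kind mentioned in the abstract would normally require diminishing step sizes.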
Pages: 2698-2703
Page count: 6
Related papers
50 in total
  • [21] Simulation-based optimization of an agent-based simulation
    Deckert, Andreas
    Klein, Robert
    NETNOMICS, 2014, 15 (01): : 33 - 56
  • [22] SIMULATION-BASED MULTIOBJECTIVE OPTIMIZATION OF BRIDGE CONSTRUCTION PROCESSES USING PARALLEL COMPUTING
    Salimi, Shide
    Mawlana, Mohammed
    Hammad, Amin
    PROCEEDINGS OF THE 2014 WINTER SIMULATION CONFERENCE (WSC), 2014, : 3272 - 3283
  • [23] Simulation-based optimization of distillation processes using an extended cutting plane algorithm
    Javaloyes-Anton, Juan
    Kronqvist, Jan
    Caballero, Jose A.
    COMPUTERS & CHEMICAL ENGINEERING, 2022, 159
  • [24] A two-timescale simulation-based gradient algorithm for weighted cost Markov Decision Processes
    He, Ying
    Fu, Michael C.
    Marcus, Steven I.
    2005 44TH IEEE CONFERENCE ON DECISION AND CONTROL & EUROPEAN CONTROL CONFERENCE, VOLS 1-8, 2005, : 8022 - 8027
  • [25] About distributed simulation-based optimization of forming processes using a grid architecture
    Grauer, M
    Barth, T
    MATERIALS PROCESSING AND DESIGN: MODELING, SIMULATION AND APPLICATIONS, PTS 1 AND 2, 2004, 712 : 2097 - 2102
  • [26] Simulation-Based Optimization of Chemical Processes Using the Extended Cutting Plane Algorithm
    Javaloyes-Anton, Juan
    Kronqvist, Jan
    Caballero, Jose A.
    28TH EUROPEAN SYMPOSIUM ON COMPUTER AIDED PROCESS ENGINEERING, 2018, 43 : 463 - 469
  • [27] Markov reward models and markov decision processes in discrete and continuous time: Performance evaluation and optimization
    Gouberman, Alexander
    Siegle, Markus
    Gouberman, Alexander (alexander.gouberman@unibw.de), 1600, Springer Verlag (8453): : 156 - 241
  • [28] Markov Decision Processes with Arbitrary Reward Processes
    Yu, Jia Yuan
    Mannor, Shie
    Shimkin, Nahum
    RECENT ADVANCES IN REINFORCEMENT LEARNING, 2008, 5323 : 268 - +
  • [29] Markov Decision Processes with Arbitrary Reward Processes
    Yu, Jia Yuan
    Mannor, Shie
    Shimkin, Nahum
    MATHEMATICS OF OPERATIONS RESEARCH, 2009, 34 (03) : 737 - 757
  • [30] Simulation-Based Optimization for Steel Stacking
    Rei, Rui Jorge
    Kubo, Mikio
    Pedroso, Joao Pedro
    MODELLING, COMPUTATION AND OPTIMIZATION IN INFORMATION SYSTEMS AND MANAGEMENT SCIENCES, PROCEEDINGS, 2008, 14 : 254 - +