Markov Decision Processes

Cited by: 1
Authors
Bäuerle N. [1]
Rieder U. [2]
Affiliations
[1] Institute for Stochastics, Karlsruhe Institute of Technology, Karlsruhe
[2] Department of Optimization and Operations Research, University of Ulm, Ulm
Keywords
Bellman equation; Linear programming; Markov chain; Markov decision process; Policy improvement
DOI
10.1365/s13291-010-0007-2
Abstract
The theory of Markov Decision Processes is the theory of controlled Markov chains. Its origins can be traced back to R. Bellman and L. Shapley in the 1950s. Over the following decades of the last century the theory grew dramatically. It has found applications in areas such as computer science, engineering, operations research, biology and economics. In this article we give a short introduction to parts of this theory. We treat Markov Decision Processes with finite and infinite time horizon, restricting the presentation to the so-called (generalized) negative case. Solution algorithms such as Howard’s policy improvement and linear programming are also explained. Various examples illustrate the application of the theory: stochastic linear-quadratic control problems, bandit problems and dividend pay-out problems. © 2010, Vieweg+Teubner und Deutsche Mathematiker-Vereinigung.
Pages: 217-243
Number of pages: 26