A Fast Analytical Algorithm for Solving Markov Decision Processes with Real-Valued Resources

Cited by: 0
Authors
Marecki, Janusz [1]
Koenig, Sven [1]
Tambe, Milind [1]
Affiliations
[1] Univ So Calif, Dept Comp Sci, Los Angeles, CA 90089 USA
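Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract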
Agents often have to construct plans that obey deadlines or, more generally, resource limits for real-valued resources whose consumption can only be characterized by probability distributions, such as execution time or battery power. These planning problems can be modeled with continuous-state Markov decision processes (MDPs), but existing solution methods are either inefficient or provide no guarantee on the quality of the resulting policy. We therefore present CPH, a novel solution method that solves the planning problems by first approximating, with any desired accuracy, the probability distributions over the resource consumptions with phase-type distributions, which use exponential distributions as building blocks. It then uses value iteration to solve the resulting MDPs by exploiting properties of exponential distributions to calculate the necessary convolutions accurately and efficiently while providing strong guarantees on the quality of the resulting policy. Our experimental feasibility study in a Mars rover domain demonstrates a substantial speedup over Lazy Approximation, which is currently the leading algorithm for solving continuous-state MDPs with quality guarantees.
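As a rough illustration of the "exponential building blocks" idea in the abstract, the sketch below evaluates the closed-form CDF of an Erlang-k distribution, the simplest phase-type distribution (k identical exponential phases in sequence). It is not the CPH algorithm and is not taken from the paper; the function names `erlang_cdf` and `erlang_rate_for_mean`, and the example mean and deadline values, are assumptions made only for this example.

```python
import math


def erlang_cdf(t: float, k: int, rate: float) -> float:
    """P(T <= t) where T is the sum of k i.i.d. Exponential(rate) phases.

    Closed form: 1 - exp(-rate*t) * sum_{n=0}^{k-1} (rate*t)^n / n!
    """
    if t <= 0:
        return 0.0
    x = rate * t
    term = 1.0   # current term (rate*t)^n / n!, starting at n = 0
    total = 1.0  # running sum of the terms
    for n in range(1, k):
        term *= x / n
        total += term
    return 1.0 - math.exp(-x) * total


def erlang_rate_for_mean(mean: float, k: int) -> float:
    """Per-phase rate so that the Erlang-k total has the given mean (k / rate)."""
    return k / mean


# Hypothetical action: mean duration 10 time units, deadline 15 time units.
mean_duration, deadline = 10.0, 15.0
for k in (1, 4, 16):
    rate = erlang_rate_for_mean(mean_duration, k)
    p = erlang_cdf(deadline, k, rate)
    print(f"k={k:2d}  P(duration <= {deadline}) = {p:.4f}")
```

With the mean held fixed, the variance of the Erlang-k duration is mean²/k, so adding phases concentrates the distribution around the mean. This is one simple sense in which chains of exponential phases can approximate less variable durations while every convolution involved stays in closed form.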
Pages: 2536-2541
Page count: 6