Efficient Algorithms for Budget-Constrained Markov Decision Processes

Cited by: 3
Authors
Caramanis, Constantine [1]
Dimitrov, Nedialko B. [2]
Morton, David P. [3]
Affiliations
[1] Univ Texas Austin, Dept Elect & Comp Engn, Austin, TX 78712 USA
[2] Naval Postgrad Sch, Dept Operat Res, Monterey, CA 93943 USA
[3] Univ Texas Austin, Grad Program Operat Res, Austin, TX 78712 USA
Funding
U.S. National Science Foundation;
Keywords
Markov decision processes (MDPs);
DOI
10.1109/TAC.2014.2314211
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Discounted, discrete-time, discrete state-space, discrete action-space Markov decision processes (MDPs) form a classical topic in control, game theory, and learning, and as a result are widely applied, increasingly in very large-scale applications. Many algorithms have been developed to solve large-scale MDPs. Algorithms based on value iteration are particularly popular, as they are more efficient than the generic linear programming approach by an order of magnitude in the number of states of the MDP. Yet in the case of budget-constrained MDPs, no algorithm more efficient than linear programming is known. Practically, the theoretically slower running times of linear programming may limit the scalability of constrained MDPs; theoretically, they invite the question of whether the increase is somehow intrinsic. In this technical note we show that it is not, and provide two algorithms for budget-constrained MDPs that are as efficient as value iteration. Denoting the running time of value iteration by VI, and the magnitude of the input by U, for an MDP with m expected budget constraints our first algorithm runs in time O(poly(m, log U) · VI). Given a pre-specified degree of precision, η, for satisfying the budget constraints, our second algorithm runs in time O(log m · poly(log U) · (1/η²) · VI), but may produce solutions that overutilize each of the m budgets by a multiplicative factor of 1 + η. In fact, one can substitute value iteration with any algorithm, possibly specially designed for a specific MDP, that solves the MDP quickly to achieve similar theoretical guarantees. Both algorithms restrict attention to constrained infinite-horizon MDPs under discounted costs.
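Illustrative note (not part of the indexed record): the abstract describes using value iteration as a subroutine for a budget-constrained MDP. The Python sketch below shows one generic way to do that for a single budget, by bisecting on a Lagrange multiplier that prices budget use. It is not the paper's algorithm and carries none of its running-time guarantees. All names (`value_iteration`, `evaluate`, `solve_with_budget`, `lam_hi`) and the one-state example are assumptions made for illustration. The sketch returns a deterministic policy, whereas an optimal constrained policy may need to randomize.

```python
def value_iteration(P, cost, gamma, tol=1e-10, max_iter=100_000):
    """Minimize expected discounted cost.

    P[a][s][t] is Pr(next state t | state s, action a);
    cost[s][a] is the per-step cost. Returns (values, greedy policy).
    """
    n, k = len(cost), len(cost[0])
    V = [0.0] * n
    for _ in range(max_iter):
        Q = [[cost[s][a] + gamma * sum(P[a][s][t] * V[t] for t in range(n))
              for a in range(k)] for s in range(n)]
        V_new = [min(row) for row in Q]
        done = max(abs(x - y) for x, y in zip(V_new, V)) < tol
        V = V_new
        if done:
            break
    policy = [min(range(k), key=lambda a: Q[s][a]) for s in range(n)]
    return V, policy


def evaluate(P, signal, policy, gamma, tol=1e-10, max_iter=100_000):
    """Expected discounted total of `signal` under a fixed deterministic policy."""
    n = len(policy)
    V = [0.0] * n
    for _ in range(max_iter):
        V_new = [signal[s][policy[s]]
                 + gamma * sum(P[policy[s]][s][t] * V[t] for t in range(n))
                 for s in range(n)]
        done = max(abs(x - y) for x, y in zip(V_new, V)) < tol
        V = V_new
        if done:
            break
    return V


def solve_with_budget(P, cost, use, gamma, budget, start=0, lam_hi=1e3, iters=50):
    """Bisect on a multiplier lam so that discounted budget use from `start`
    stays within `budget`. Assumes (hypothetically) that lam_hi is large
    enough for the resulting policy to be feasible."""
    def policy_for(lam):
        # Price budget use at lam and solve the resulting unconstrained MDP.
        priced = [[c + lam * u for c, u in zip(c_row, u_row)]
                  for c_row, u_row in zip(cost, use)]
        _, pol = value_iteration(P, priced, gamma)
        return pol, evaluate(P, use, pol, gamma)[start]

    pol, used = policy_for(0.0)
    if used <= budget:          # the unconstrained optimum is already feasible
        return pol, 0.0
    lo, hi = 0.0, lam_hi
    for _ in range(iters):
        mid = (lo + hi) / 2
        _, used = policy_for(mid)
        if used <= budget:
            hi = mid            # feasible: try a smaller price
        else:
            lo = mid            # infeasible: raise the price
    pol, _ = policy_for(hi)
    return pol, hi


# Hypothetical one-state example: action 0 costs 1 and uses no budget,
# action 1 costs 0 and uses one unit of budget per step.
P = [[[1.0]], [[1.0]]]
cost = [[1.0, 0.0]]
use = [[0.0, 1.0]]
policy, lam = solve_with_budget(P, cost, use, gamma=0.5, budget=1.0)
```

In this example the unconstrained optimum would consume a discounted budget of 2, so the bisection raises the price until the budget-free action becomes the greedy choice. Each bisection step costs one call to value iteration, which is the sense in which the subroutine could be swapped for any other fast MDP solver.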
Pages: 2813-2817
Number of pages: 5
Related papers
50 in total
  • [1] Approximation algorithms for budget-constrained auctions
    Garg, R
    Kumar, V
    Pandit, V
    [J]. APPROXIMATION, RANDOMIZATION, AND COMBINATORIAL OPTIMIZATION: ALGORITHMS AND TECHNIQUES, 2001, 2129 : 102 - 113
  • [2] Algorithms for budget-constrained survivable topology design
    Garg, N
    Simha, R
    Xing, WX
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, VOLS 1-5, CONFERENCE PROCEEDINGS, 2002, : 2162 - 2166
  • [3] Efficient algorithms for Risk-Sensitive Markov Decision Processes with limited budget
    Melo Moreira, Daniel A.
    Delgado, Karina Valdivia
    de Barros, Leliane Nunes
    Maua, Denis Deratani
    [J]. INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2021, 139 : 143 - 165
  • [4] BUDGET-CONSTRAINED PARETO-EFFICIENT ALLOCATIONS
    BALASKO, Y
    [J]. JOURNAL OF ECONOMIC THEORY, 1979, 21 (03) : 359 - 379
  • [5] THE EXISTENCE OF BUDGET-CONSTRAINED PARETO-EFFICIENT ALLOCATIONS
    SVENSSON, LG
    [J]. JOURNAL OF ECONOMIC THEORY, 1984, 32 (02) : 346 - 350
  • [6] Budget-constrained search
    Manning, R
    Manning, JRA
    [J]. EUROPEAN ECONOMIC REVIEW, 1997, 41 (09) : 1817 - 1834
  • [7] BUDGET-CONSTRAINED PARETO EFFICIENT ALLOCATIONS - A DYNAMIC STORY
    BALASKO, Y
    [J]. JOURNAL OF ECONOMIC THEORY, 1982, 27 (01) : 239 - 242
  • [8] Constrained Multiagent Markov Decision Processes: a Taxonomy of Problems and Algorithms
    de Nijs, Frits
    Walraven, Erwin
    de Weerdt, Mathijs M.
    Spaan, Matthijs T. J.
    [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2021, 70 : 955 - 1001
  • [9] Learning algorithms for finite horizon constrained Markov decision processes
    Mittal, A.
    Hemachandra, N.
    [J]. JOURNAL OF INDUSTRIAL AND MANAGEMENT OPTIMIZATION, 2007, 3 (03) : 429 - 444