Linear Programming for Large-Scale Markov Decision Problems

Cited by: 0
Authors
Abbasi-Yadkori, Yasin [1]
Bartlett, Peter L. [1, 2]
Malek, Alan [2]
Affiliations
[1] Queensland Univ Technol, Brisbane, Qld 4000, Australia
[2] Univ Calif Berkeley, Berkeley, CA 94720 USA
Funding
Australian Research Council
Keywords
RANDOMIZED SOLUTIONS;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We consider the problem of controlling a Markov decision process (MDP) with a large state space so as to minimize average cost. Since it is intractable to compete with the optimal policy for large-scale problems, we pursue the more modest goal of competing with a low-dimensional family of policies. We use the dual linear programming formulation of the MDP average cost problem, in which the variable is a stationary distribution over state-action pairs, and we consider a neighborhood of a low-dimensional subset of the set of stationary distributions (defined in terms of state-action features) as the comparison class. We propose a technique based on stochastic convex optimization and give bounds that show that the performance of our algorithm approaches the best achievable by any policy in the comparison class. Most importantly, this result depends on the size of the comparison class, but not on the size of the state space. Preliminary experiments show the effectiveness of the proposed algorithm in a queuing application.
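To make the dual formulation mentioned in the abstract more concrete, here is a minimal sketch in Python on a hypothetical two-state, two-action MDP. The transition probabilities, costs, policy, and all function names are illustrative assumptions and are not taken from the paper. The sketch builds the stationary state-action distribution of one fixed policy by power iteration and checks that it satisfies the constraints a dual LP variable must obey (nonnegativity, summing to one, and flow conservation). It does not implement the stochastic convex optimization method the abstract proposes.

```python
# Hypothetical toy MDP; every number below is an illustrative assumption.
# P[s][a][t] = probability of moving from state s to state t under action a.
P = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.1, 0.9]],
]
# cost[s][a] = immediate cost of taking action a in state s.
cost = [[1.0, 2.0], [0.0, 3.0]]
# policy[s][a] = probability of choosing action a in state s
# (this example policy always picks action 0).
policy = [[1.0, 0.0], [1.0, 0.0]]


def stationary_state_action(P, policy, iters=500):
    """Approximate the stationary state-action distribution mu[s][a]
    of a fixed policy by power iteration on the state distribution."""
    n = len(P)
    nu = [1.0 / n] * n  # start from the uniform state distribution
    for _ in range(iters):
        nxt = [0.0] * n
        for s in range(n):
            for a, p_a in enumerate(policy[s]):
                for t in range(n):
                    nxt[t] += nu[s] * p_a * P[s][a][t]
        nu = nxt
    # mu(s, a) = nu(s) * policy(a | s)
    return [[nu[s] * p_a for p_a in policy[s]] for s in range(n)]


def average_cost(mu, cost):
    """Dual LP objective: expected cost under the distribution mu."""
    return sum(m * c for mu_s, c_s in zip(mu, cost) for m, c in zip(mu_s, c_s))


def flow_residual(mu, P):
    """Largest violation of flow conservation: for each state t, the mass
    leaving t (sum over a of mu[t][a]) should equal the mass entering t."""
    n = len(P)
    worst = 0.0
    for t in range(n):
        outflow = sum(mu[t])
        inflow = sum(mu[s][a] * P[s][a][t]
                     for s in range(n) for a in range(len(mu[s])))
        worst = max(worst, abs(outflow - inflow))
    return worst


mu = stationary_state_action(P, policy)
print("mu =", mu)
print("average cost =", average_cost(mu, cost))
print("flow residual =", flow_residual(mu, P))
```

In the dual LP, minimizing `average_cost` over every `mu` that satisfies these constraints recovers the optimal average cost. The abstract's approach instead restricts attention to a low-dimensional family of such distributions defined through state-action features.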
Pages: 496-504
Page count: 9