Linear Programming for Large-Scale Markov Decision Problems

Cited by: 0

Authors:

Abbasi-Yadkori, Yasin [1]

Bartlett, Peter L. [1,2]

Malek, Alan [2]

Affiliations:

[1] Queensland Univ Technol, Brisbane, Qld 4000, Australia

[2] Univ Calif Berkeley, Berkeley, CA 94720 USA

Source:

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 32 (CYCLE 2) | 2014 / Vol. 32

Funding:

Australian Research Council;

Keywords:

RANDOMIZED SOLUTIONS;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We consider the problem of controlling a Markov decision process (MDP) with a large state space, so as to minimize average cost. Since it is intractable to compete with the optimal policy for large-scale problems, we pursue the more modest goal of competing with a low-dimensional family of policies. We use the dual linear programming formulation of the MDP average cost problem, in which the variable is a stationary distribution over state-action pairs, and we consider a neighborhood of a low-dimensional subset of the set of stationary distributions (defined in terms of state-action features) as the comparison class. We propose a technique based on stochastic convex optimization and give bounds that show that the performance of our algorithm approaches the best achievable by any policy in the comparison class. Most importantly, this result depends on the size of the comparison class, but not on the size of the state space. Preliminary experiments show the effectiveness of the proposed algorithm in a queuing application.
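
For orientation, the dual linear program mentioned in the abstract can be written in its standard textbook form. This is a sketch, not the paper's own notation: the symbols $\mu$ (stationary distribution over state-action pairs), $\ell$ (per-step cost), and $P$ (transition kernel) are assumed here for illustration.

```latex
% Standard dual LP for the average-cost MDP (assumed notation, not taken from the record)
\begin{aligned}
\min_{\mu}\quad & \sum_{x,a} \mu(x,a)\,\ell(x,a) \\
\text{s.t.}\quad & \sum_{a} \mu(x',a) = \sum_{x,a} \mu(x,a)\,P(x' \mid x,a)
  \quad \text{for all states } x', \\
& \sum_{x,a} \mu(x,a) = 1, \qquad \mu(x,a) \ge 0 \quad \text{for all } (x,a).
\end{aligned}
```

The program has one variable per state-action pair, so its size grows with the state space. That is presumably why the abstract restricts attention to a low-dimensional subset of stationary distributions defined through state-action features.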


Pages: 496-504

Page count: 9
