Efficient sampling in approximate dynamic programming algorithms

Cited by: 21
Authors
Cervellera, Cristiano [1]
Muselli, Marco [1]
Affiliation
[1] Ist Studi Sistemi Intelligenti Lautomaz, Consiglio Nazl Ric, I-16149 Genoa, Italy
Keywords
stochastic optimal control problem; dynamic programming; sample complexity; deterministic learning; low-discrepancy sequences
DOI
10.1007/s10589-007-9054-8
Chinese Library Classification (CLC) number
C93 [Management science]; O22 [Operations research]
Discipline classification codes
070105; 12; 1201; 1202; 120202
Abstract
Dynamic Programming (DP) is a standard optimization tool for solving Stochastic Optimal Control (SOC) problems over either a finite or an infinite horizon of stages. Under very general assumptions, commonly employed numerical algorithms are based on approximations of the cost-to-go functions by means of suitable parametric models built from a set of sampling points in the d-dimensional state space. This paper discusses the problem of sample complexity, i.e., how "fast" the number of points must grow with the input dimension in order to obtain an accurate estimate of the cost-to-go functions in typical DP approaches such as value iteration and policy iteration. It is shown that choosing the sampling points from low-discrepancy sequences, commonly used for efficient numerical integration, makes it possible to achieve, under suitable hypotheses, an almost linear sample complexity, thus helping to mitigate the curse of dimensionality of the approximate DP procedure.
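As a rough illustration of the sampling idea in the abstract, the sketch below draws state-space points from a Halton sequence (one common low-discrepancy sequence) and from a pseudo-random generator, then compares how well each set averages a smooth function over the unit cube. This is not the authors' algorithm: the Halton construction is chosen only as a familiar example, and `toy_cost`, `mean_cost`, the dimension, and the sample size are hypothetical names and values introduced for the illustration.

```python
import math
import random

# First few primes, used as the per-coordinate bases of the Halton sequence.
PRIMES = [2, 3, 5, 7, 11, 13]


def radical_inverse(index, base):
    """Van der Corput radical inverse of a non-negative integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n_points, dim):
    """First n_points of the Halton sequence in [0, 1)^dim (indices start at 1)."""
    if dim > len(PRIMES):
        raise ValueError("extend PRIMES to sample in higher dimensions")
    return [[radical_inverse(i, PRIMES[k]) for k in range(dim)]
            for i in range(1, n_points + 1)]


def toy_cost(x):
    """Hypothetical smooth stand-in for a cost-to-go function.

    Each factor integrates to 1 over [0, 1], so the exact mean over the
    unit cube is 1 in any dimension.
    """
    return math.prod(1.0 + 0.5 * (xi - 0.5) for xi in x)


def mean_cost(points):
    """Sample average of toy_cost over a list of points."""
    return sum(toy_cost(p) for p in points) / len(points)


dim, n = 3, 1024  # assumed values, for illustration only
rng = random.Random(0)
random_points = [[rng.random() for _ in range(dim)] for _ in range(n)]

halton_error = abs(mean_cost(halton(n, dim)) - 1.0)
random_error = abs(mean_cost(random_points) - 1.0)
print(f"Halton error: {halton_error:.2e}, pseudo-random error: {random_error:.2e}")
```

In an approximate DP setting, the same point sets would serve as the states at which the cost-to-go function is evaluated before fitting a parametric model; the averaging test here is only a cheap proxy for how evenly the points cover the state space.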
Pages: 417-443
Number of pages: 27