Stochastic Optimal Control using Local Sample-based Value Function Approximation

Cited by: 0
Authors
Dolgov, Maxim [1 ]
Kurz, Gerhard [2 ]
Grimm, Daniela [2 ]
Rosenthal, Florian [2 ]
Hanebeck, Uwe D. [2 ]
Affiliations
[1] Robert Bosch GmbH, Corp Res, Stuttgart, Germany
[2] Karlsruhe Inst Technol KIT, Inst Anthropomat & Robot, Intelligent Sensor Actuator Syst Lab ISAS, Karlsruhe, Germany
Keywords
OBSERVABLE MARKOV-PROCESSES; VALUE-ITERATION;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
In stochastic optimal control and partially observable Markov decision processes, trajectory optimization methods iteratively deform a reference trajectory in a space of probability distributions until the performance criterion associated with the problem attains an optimum. Related state-of-the-art trajectory optimization approaches are restricted to the space of Gaussian probability distributions: during optimization, they perform a second-order Taylor expansion of the value function at the parameters of the Gaussian, i.e., the mean and the covariance. In this paper, we propose a novel approach in which trajectory optimization is performed in the space of Dirac distributions and the Taylor expansion of the value function is carried out at the positions of its samples. This allows us to handle non-Gaussian distributions, because Dirac distributions are commonly used to approximate arbitrary probability distributions. The proposed approach is demonstrated in a simulation.
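The core idea of expanding the value function at sample positions can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a scalar state, a hypothetical quadratic `value` function, finite-difference derivatives, and hand-picked samples, weights, and shifts. It only shows how a weighted set of Dirac components can represent a bimodal (non-Gaussian) distribution, and how a second-order Taylor expansion at each sample predicts the expected value after the samples are moved.

```python
def taylor2(f, x, h=1e-3):
    """Return (f(x), f'(x), f''(x)) at x using central finite differences."""
    f0 = f(x)
    f_plus = f(x + h)
    f_minus = f(x - h)
    grad = (f_plus - f_minus) / (2.0 * h)
    hess = (f_plus - 2.0 * f0 + f_minus) / (h * h)
    return f0, grad, hess


def expected_value(f, samples, weights):
    """Expectation of f under a weighted set of Dirac components."""
    return sum(w * f(x) for x, w in zip(samples, weights))


def expected_value_taylor(f, samples, weights, shifts):
    """Predict the expectation after shifting each sample by its shift,
    using a second-order Taylor expansion of f at each sample position."""
    total = 0.0
    for x, w, d in zip(samples, weights, shifts):
        f0, grad, hess = taylor2(f, x)
        total += w * (f0 + grad * d + 0.5 * hess * d * d)
    return total


def value(x):
    """Hypothetical value function (assumption: quadratic cost around 1.0)."""
    return (x - 1.0) ** 2


# Assumed bimodal sample set; a single Gaussian would describe it poorly.
samples = [-2.0, -1.5, 1.5, 2.0]
weights = [0.25, 0.25, 0.25, 0.25]
# Assumed small deformation pulling both modes towards the origin.
shifts = [0.1, 0.1, -0.1, -0.1]

predicted = expected_value_taylor(value, samples, weights, shifts)
shifted_samples = [x + d for x, d in zip(samples, shifts)]
exact = expected_value(value, shifted_samples, weights)
print(predicted, exact)
```

Because the assumed value function is quadratic, the per-sample expansion reproduces the shifted expectation up to finite-difference rounding; for a general value function it would only be a local approximation around each sample.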
Pages: 2145 - 2150
Page count: 6
Related papers
50 in total
  • [1] Sample-Based Information-Theoretic Stochastic Optimal Control
    Lioutikov, Rudolf
    Paraschos, Alexandros
    Peters, Jan
    Neumann, Gerhard
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2014, : 3896 - 3902
  • [2] Sample-Based Potentials Estimation for the Optimal Control of Stochastic System
    Cheng Kang
    Zhang Kanjian
    Fei Shumin
    Liu Xiao-Mei
    [J]. 2011 30TH CHINESE CONTROL CONFERENCE (CCC), 2011, : 2031 - 2035
  • [3] Provably Near-Optimal Approximation Schemes for Implicit Stochastic and Sample-Based Dynamic Programs
    Halman, Nir
    [J]. INFORMS JOURNAL ON COMPUTING, 2020, 32 (04) : 1157 - 1181
  • [4] An Online Sample-Based Method for Mode Estimation Using ODE Analysis of Stochastic Approximation Algorithms
    Kamanchi, Chandramouli
    Diddigi, Raghuram Bharadwaj
    Prabuchandran, K. J.
    Bhatnagar, Shalabh
    [J]. IEEE CONTROL SYSTEMS LETTERS, 2019, 3 (03) : 697 - 702
  • [5] Sample-Based Optimal Pricing
    Allouah, Amine
    Besbes, Omar
    [J]. ACM EC '19: PROCEEDINGS OF THE 2019 ACM CONFERENCE ON ECONOMICS AND COMPUTATION, 2019, : 391 - 391
  • [6] Sequence-Based Stochastic Receding Horizon Control Using IMM Filtering and Value Function Approximation
    Rosenthal, Florian
    Hanebeck, Uwe D.
    [J]. 2019 IEEE 58TH CONFERENCE ON DECISION AND CONTROL (CDC), 2019, : 6424 - 6430
  • [7] Local regularity of the value function in optimal control
    Cannarsa, P.
    Frankowska, H.
    [J]. SYSTEMS & CONTROL LETTERS, 2013, 62 (09) : 791 - 794
  • [8] Sample-based estimation of correlation ratio with polynomial approximation
    Lewandowski, Daniel
    Cooke, Roger M.
    Tebbens, Radboud J. Duintjer
    [J]. ACM TRANSACTIONS ON MODELING AND COMPUTER SIMULATION, 2008, 18 (01):
  • [9] Sample-based polynomial approximation of rational Bezier curves
    Lu, Lizheng
    [J]. JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 2011, 235 (06) : 1557 - 1563
  • [10] Sample-Based Optimal Transport and Barycenter Problems
    Kuang, Max
    Tabak, Esteban G.
    [J]. COMMUNICATIONS ON PURE AND APPLIED MATHEMATICS, 2019, 72 (08) : 1581 - 1630