Learning Stochastic Optimal Policies via Gradient Descent

Cited by: 3
Authors
Massaroli, Stefano [1]
Poli, Michael [2]
Peluchetti, Stefano [3]
Park, Jinkyoo [2]
Yamashita, Atsushi [1]
Asama, Hajime [1]
Affiliations
[1] Univ Tokyo, Dept Precis Engn, Tokyo 1138656, Japan
[2] Korea Adv Inst Sci & Technol, Dept Ind & Syst, Daejeon 305335, South Korea
[3] Cogent Labs, Daejeon 305701, South Korea
Source
Funding
National Research Foundation, Singapore;
Keywords
Optimal control; Indium tin oxide; Stochastic processes; Process control; Optimization; Neural networks; Noise measurement; stochastic processes; machine learning; PORTFOLIO SELECTION; CONVERGENCE; ITO;
DOI
10.1109/LCSYS.2021.3086672
Chinese Library Classification
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
We systematically develop a learning-based treatment of stochastic optimal control (SOC) that relies on direct optimization of parametric control policies. We propose a derivation of adjoint sensitivity results for stochastic differential equations through direct application of variational calculus. Then, given an objective function for a predetermined task that specifies the desiderata for the controller, we optimize the controller's parameters via iterative gradient descent methods. In doing so, we extend the range of applicability of classical SOC techniques, which often require strict assumptions on the functional form of the system and control. We verify the performance of the proposed approach on a continuous-time, finite-horizon portfolio optimization problem with proportional transaction costs.
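To make the idea of optimizing a parametric policy by gradient descent concrete, here is a minimal sketch under stated assumptions. It is not the paper's method: it replaces the adjoint sensitivity results with simple pathwise forward sensitivities, and it uses a hypothetical scalar linear SDE dx = (a x + u) dt + sigma dW with a one-parameter policy u = -theta x and a quadratic cost, not the portfolio problem. All constants and names (`A`, `SIGMA`, `R`, `cost_and_grad`) are illustrative assumptions.

```python
import math
import random

# Hypothetical problem constants (assumptions, not taken from the paper).
A, SIGMA, R = 0.5, 0.3, 0.1      # drift, noise scale, control penalty
X0, DT, STEPS = 1.0, 0.02, 50    # initial state, step size, horizon T = 1

def cost_and_grad(theta, noises):
    """Sample-average cost J(theta) and its exact derivative dJ/dtheta."""
    total_cost, total_grad = 0.0, 0.0
    w = 1.0 + R * theta * theta   # running cost is w * x^2 since u = -theta * x
    for path in noises:
        x, s = X0, 0.0            # state and sensitivity s = dx/dtheta
        for dw in path:
            total_cost += w * x * x * DT
            total_grad += (2.0 * w * x * s + 2.0 * R * theta * x * x) * DT
            # Euler-Maruyama step, and the same step differentiated in theta
            x, s = (x + (A - theta) * x * DT + SIGMA * dw,
                    s + ((A - theta) * s - x) * DT)
        total_cost += x * x       # terminal cost x_T^2
        total_grad += 2.0 * x * s
    n = len(noises)
    return total_cost / n, total_grad / n

rng = random.Random(0)
# Fixed Brownian increments (common random numbers) make J deterministic.
noises = [[rng.gauss(0.0, math.sqrt(DT)) for _ in range(STEPS)]
          for _ in range(100)]

theta = 0.0
initial_cost, _ = cost_and_grad(theta, noises)
for _ in range(150):
    _, grad = cost_and_grad(theta, noises)
    theta -= 0.05 * grad          # plain gradient descent step
final_cost, _ = cost_and_grad(theta, noises)
print(f"theta = {theta:.3f}, cost {initial_cost:.3f} -> {final_cost:.3f}")
```

Forward sensitivities are convenient here because the policy has a single parameter; for policies with many parameters, such as neural networks, an adjoint (backward) computation of the kind the abstract describes is typically the cheaper choice.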
Pages: 1094-1099
Page count: 6