Learning Stochastic Optimal Policies via Gradient Descent

Cited by: 3
Authors
Massaroli, Stefano [1 ]
Poli, Michael [2 ]
Peluchetti, Stefano [3 ]
Park, Jinkyoo [2 ]
Yamashita, Atsushi [1 ]
Asama, Hajime [1 ]
Affiliations
[1] Univ Tokyo, Dept Precis Engn, Tokyo 1138656, Japan
[2] Korea Adv Inst Sci & Technol, Dept Ind & Syst Engn, Daejeon 305335, South Korea
[3] Cogent Labs, Daejeon 305701, South Korea
Source
IEEE CONTROL SYSTEMS LETTERS
Funding
National Research Foundation of Singapore;
Keywords
Optimal control; Ito calculus; Stochastic processes; Process control; Optimization; Neural networks; Noise measurement; stochastic processes; machine learning; PORTFOLIO SELECTION; CONVERGENCE; ITO;
DOI
10.1109/LCSYS.2021.3086672
CLC Classification
TP [Automation Technology; Computer Technology];
Discipline Code
0812;
Abstract
We systematically develop a learning-based treatment of stochastic optimal control (SOC), relying on direct optimization of parametric control policies. We propose a derivation of adjoint sensitivity results for stochastic differential equations through a direct application of variational calculus. Then, given an objective function for a predetermined task that specifies the desiderata for the controller, we optimize the policy parameters via iterative gradient descent methods. In doing so, we extend the range of applicability of classical SOC techniques, which often require strict assumptions on the functional form of the system and the controller. We verify the performance of the proposed approach on a continuous-time, finite-horizon portfolio optimization problem with proportional transaction costs.
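The approach summarized above amounts to simulating the controlled stochastic differential equation forward in time and descending the gradient of a Monte Carlo estimate of the control objective with respect to the policy parameters. The following is a minimal illustrative sketch, not the authors' implementation: the scalar linear drift, constant diffusion, quadratic cost, and small policy network are all assumptions, and the gradient is obtained by backpropagating through an Euler-Maruyama rollout rather than via the adjoint sensitivity results derived in the paper.

```python
# Minimal sketch of direct policy optimization for a controlled SDE
#   dx_t = f(x_t, u_t) dt + sigma dW_t,   u_t = pi_theta(x_t),
# minimizing E[ \int_0^T (x_t^2 + 0.1 u_t^2) dt + x_T^2 ].
# All modeling choices below are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Parametric control policy pi_theta: a small neural network (assumed form).
policy = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

T, steps, batch = 1.0, 50, 256
dt = T / steps
sigma = 0.2  # assumed constant diffusion coefficient

for iteration in range(200):
    x = torch.randn(batch, 1)            # sampled initial states
    cost = torch.zeros(batch, 1)
    for _ in range(steps):
        u = policy(x)
        drift = -x + u                   # assumed affine-in-control drift
        dW = dt ** 0.5 * torch.randn(batch, 1)
        x = x + drift * dt + sigma * dW  # Euler-Maruyama step
        cost = cost + (x ** 2 + 0.1 * u ** 2) * dt  # running quadratic cost
    loss = (cost + x ** 2).mean()        # add terminal cost, average over paths
    opt.zero_grad()
    loss.backward()                      # pathwise gradient through the rollout
    opt.step()
```

Because the Brownian increments are sampled independently of the parameters, backpropagating through the simulated paths yields an unbiased pathwise (reparameterization) gradient estimate of the expected cost, on which plain stochastic gradient descent over the policy parameters can operate.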
Pages: 1094 - 1099
Page count: 6
Related Papers
50 records in total
  • [31] Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent
    Liu, Kangqiao
Liu, Ziyin
    Ueda, Masahito
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021
  • [32] Learning curves for stochastic gradient descent in linear feedforward networks
    Werfel, J
    Xie, XH
    Seung, HS
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 16, 2004, 16 : 1197 - 1204
  • [33] Weighted Aggregating Stochastic Gradient Descent for Parallel Deep Learning
    Guo, Pengzhan
    Ye, Zeyang
    Xiao, Keli
    Zhu, Wei
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (10) : 5037 - 5050
  • [34] Learning curves for stochastic gradient descent in linear feedforward networks
    Werfel, J
    Xie, XH
    Seung, HS
    NEURAL COMPUTATION, 2005, 17 (12) : 2699 - 2718
  • [35] Stability and optimization error of stochastic gradient descent for pairwise learning
    Shen, Wei
    Yang, Zhenhuan
    Ying, Yiming
    Yuan, Xiaoming
    ANALYSIS AND APPLICATIONS, 2020, 18 (05) : 887 - 927
  • [36] Learning-to-Learn Stochastic Gradient Descent with Biased Regularization
    Denevi, Giulia
    Ciliberto, Carlo
    Grazzi, Riccardo
    Pontil, Massimiliano
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019
  • [37] A Novel Stochastic Gradient Descent Algorithm for Learning Principal Subspaces
    Le Lan, Charline
    Greaves, Joshua
    Farebrother, Jesse
    Rowland, Mark
    Pedregosa, Fabian
    Agarwal, Rishabh
    Bellemare, Marc
arXiv, 2022
  • [38] A Novel Stochastic Gradient Descent Algorithm for Learning Principal Subspaces
    Le Lan, Charline
    Greaves, Joshua
    Farebrother, Jesse
    Rowland, Mark
    Pedregosa, Fabian
    Agarwal, Rishabh
    Bellemare, Marc
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 206, 2023
  • [39] Unforgeability in Stochastic Gradient Descent
    Baluta, Teodora
    Nikolic, Ivica
    Jain, Racchit
    Aggarwal, Divesh
    Saxena, Prateek
    PROCEEDINGS OF THE 2023 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, CCS 2023, 2023, : 1138 - 1152
  • [40] Preconditioned Stochastic Gradient Descent
    Li, Xi-Lin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (05) : 1454 - 1466