Fast Stochastic Kalman Gradient Descent for Reinforcement Learning

Citations: 0
Authors
Totaro, Simone [1 ]
Jonsson, Anders [1 ]
Affiliations
[1] Univ Pompeu Fabra, Dept Informat & Commun Technol, Barcelona, Spain
Keywords
Non-stationary MDPs; Reinforcement Learning; Tracking
DOI
Not available
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
As we move towards real-world applications, there is an increasing need for scalable, online optimization algorithms capable of dealing with the non-stationarity of the real world. We revisit the problem of online policy evaluation in non-stationary deterministic MDPs through the lens of Kalman filtering. We introduce a randomized regularization technique called Stochastic Kalman Gradient Descent (SKGD) that, combined with a low-rank update, generates a sequence of feasible iterates. SKGD is suitable for large-scale optimization of non-linear function approximators. We evaluate the performance of SKGD in two controlled experiments and in one real-world application of microgrid control. In our experiments, SKGD is more robust to drift in the transition dynamics than state-of-the-art reinforcement learning algorithms, and the resulting policies are smoother.
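A minimal, self-contained numpy sketch of the general idea behind a Kalman-filter-style gradient update may help readers parse the abstract. Everything below is an illustrative assumption: the function name skgd_step, the parameters drift and reg_scale, the diagonal covariance (standing in for the paper's low-rank update), and the jittered observation noise (standing in for the randomized regularization) are not the authors' published algorithm.

import numpy as np

rng = np.random.default_rng(0)

def skgd_step(theta, P, grad, obs_noise=1.0, drift=1e-3, reg_scale=1e-2):
    # Hypothetical sketch, not the published SKGD update.
    # Treat the stochastic gradient as a noisy observation and track the
    # parameters with a per-coordinate (diagonal) Kalman filter.
    # Predict: process noise models drift in a non-stationary MDP.
    P_pred = P + drift
    # Randomized regularization (assumption): jitter the observation-noise
    # term so the effective per-coordinate step size is stochastic.
    r = obs_noise + reg_scale * rng.exponential(size=theta.shape)
    # Kalman gain of the diagonal model, used here as an adaptive step size.
    K = P_pred / (P_pred + r)
    # Correct: move the parameters and shrink the covariance.
    theta_new = theta - K * grad
    P_new = (1.0 - K) * P_pred
    return theta_new, P_new

# Toy usage: track the minimizer of a slowly drifting quadratic.
d = 5
theta, P = np.zeros(d), np.ones(d)
target = np.ones(d)
for t in range(200):
    target += 0.01 * rng.standard_normal(d)                       # non-stationary optimum
    grad = 2.0 * (theta - target) + 0.1 * rng.standard_normal(d)  # noisy gradient
    theta, P = skgd_step(theta, P, grad)
print(np.round(theta - target, 2))                                # residual tracking error

In this sketch the process-noise term drift keeps the Kalman gain bounded away from zero, which is what lets the update keep tracking a moving target where plain SGD with a decaying step size would stall; this mirrors the tracking behaviour the abstract attributes to SKGD under drifting transition dynamics.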
Pages: 12