Fast Stochastic Kalman Gradient Descent for Reinforcement Learning

Citations: 0
Authors
Totaro, Simone [1 ]
Jonsson, Anders [1 ]
Affiliations
[1] Univ Pompeu Fabra, Dept Informat & Commun Technol, Barcelona, Spain
Keywords
Non-stationary MDPs; Reinforcement Learning; Tracking
DOI
Not available
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
As we move towards real-world applications, there is an increasing need for scalable, online optimization algorithms capable of dealing with the non-stationarity of the real world. We revisit the problem of online policy evaluation in non-stationary deterministic MDPs through the lens of Kalman filtering. We introduce a randomized regularization technique called Stochastic Kalman Gradient Descent (SKGD) that, combined with a low-rank update, generates a sequence of feasible iterates. SKGD is suitable for large-scale optimization of non-linear function approximators. We evaluate the performance of SKGD in two controlled experiments and in one real-world application of microgrid control. In our experiments, SKGD is more robust to drift in the transition dynamics than state-of-the-art reinforcement learning algorithms, and the resulting policies are smoother.
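A minimal, self-contained numpy sketch of the general idea behind a Kalman-filter-style gradient update may help readers parse the abstract. Everything below is an illustrative assumption: the function name skgd_step, the parameters drift and reg_scale, the diagonal covariance (standing in for the paper's low-rank update), and the jittered observation noise (standing in for the randomized regularization) are not the authors' published algorithm.

import numpy as np

rng = np.random.default_rng(0)

def skgd_step(theta, P, grad, obs_noise=1.0, drift=1e-3, reg_scale=1e-2):
    # Hypothetical sketch, not the published SKGD update.
    # Treat the stochastic gradient as a noisy observation and track the
    # parameters with a per-coordinate (diagonal) Kalman filter.
    # Predict: process noise models drift in a non-stationary MDP.
    P_pred = P + drift
    # Randomized regularization (assumption): jitter the observation-noise
    # term so the effective per-coordinate step size is stochastic.
    r = obs_noise + reg_scale * rng.exponential(size=theta.shape)
    # Kalman gain of the diagonal model, used here as an adaptive step size.
    K = P_pred / (P_pred + r)
    # Correct: move the parameters and shrink the covariance.
    theta_new = theta - K * grad
    P_new = (1.0 - K) * P_pred
    return theta_new, P_new

# Toy usage: track the minimizer of a slowly drifting quadratic.
d = 5
theta, P = np.zeros(d), np.ones(d)
target = np.ones(d)
for t in range(200):
    target += 0.01 * rng.standard_normal(d)                       # non-stationary optimum
    grad = 2.0 * (theta - target) + 0.1 * rng.standard_normal(d)  # noisy gradient
    theta, P = skgd_step(theta, P, grad)
print(np.round(theta - target, 2))                                # residual tracking error

In this sketch the process-noise term drift keeps the Kalman gain bounded away from zero, which is what lets the update keep tracking a moving target where plain SGD with a decaying step size would stall; this mirrors the tracking behaviour the abstract attributes to SKGD under drifting transition dynamics.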
Pages: 12