Fast Stochastic Kalman Gradient Descent for Reinforcement Learning

Cited by: 0

Authors:

Totaro, Simone [1]

Jonsson, Anders [1]

Affiliations:

[1] Univ Pompeu Fabra, Dept Informat & Commun Technol, Barcelona, Spain

Source:

LEARNING FOR DYNAMICS AND CONTROL, VOL 144 | 2021 / Volume 144

Keywords:

Non-stationary MDPs; Reinforcement Learning; Tracking;

DOI:

Not available

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

As we move towards real-world applications, there is an increasing need for scalable, online optimization algorithms capable of dealing with the non-stationarity of the real world. We revisit the problem of online policy evaluation in non-stationary deterministic MDPs through the lens of Kalman filtering. We introduce a randomized regularization technique called Stochastic Kalman Gradient Descent (SKGD) that, combined with a low-rank update, generates a sequence of feasible iterates. SKGD is suitable for large-scale optimization of non-linear function approximators. We evaluate the performance of SKGD in two controlled experiments and in one real-world application of microgrid control. In our experiments, SKGD is more robust to drift in the transition dynamics than state-of-the-art reinforcement learning algorithms, and the resulting policies are smoother.


Pages: 12

