Fast Stochastic Kalman Gradient Descent for Reinforcement Learning

Cited: 0
Authors
Totaro, Simone [1 ]
Jonsson, Anders [1 ]
Affiliations
[1] Univ Pompeu Fabra, Dept Informat & Commun Technol, Barcelona, Spain
Keywords
Non-stationary MDPs; Reinforcement Learning; Tracking
DOI
Not available
Chinese Library Classification (CLC)
TP [Automation & Computer Technology]
Discipline Classification Code
0812
Abstract
As we move towards real-world applications, there is an increasing need for scalable online optimization algorithms that can cope with the non-stationarity of the real world. We revisit the problem of online policy evaluation in non-stationary deterministic MDPs through the lens of Kalman filtering. We introduce a randomized regularization technique called Stochastic Kalman Gradient Descent (SKGD) that, combined with a low-rank update, generates a sequence of feasible iterates. SKGD is suitable for large-scale optimization of non-linear function approximators. We evaluate the performance of SKGD in two controlled experiments and in one real-world application of microgrid control. In our experiments, SKGD is more robust to drift in the transition dynamics than state-of-the-art reinforcement learning algorithms, and the resulting policies are smoother.
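The record itself contains no code, but the abstract's Kalman-filtering view of online policy evaluation can be illustrated concretely. Below is a minimal sketch, assuming a linear value function and a standard Kalman-style TD(0) recursion; it is not the paper's SKGD (the stochastic regularization and low-rank update that define SKGD are omitted), and all names, dimensions, and noise settings are illustrative assumptions.

import numpy as np

# Minimal sketch (assumed, not from the paper): Kalman-style online TD(0)
# for a linear value function V(s) = phi(s) @ theta. The process-noise
# term Q models drift in a non-stationary MDP.

class KalmanTD:
    def __init__(self, d, gamma=0.99, q=1e-4, r_var=1.0):
        self.theta = np.zeros(d)      # value-function weights
        self.P = np.eye(d)            # parameter covariance estimate
        self.Q = q * np.eye(d)        # process noise (drift model)
        self.gamma = gamma            # discount factor
        self.r_var = r_var            # observation-noise variance

    def update(self, phi, phi_next, reward):
        # Predict step: inflate covariance so the filter keeps tracking.
        self.P += self.Q
        # TD(0) treats reward + gamma * V(s') as a noisy observation of
        # the scalar phi @ theta (a semi-gradient target).
        target = reward + self.gamma * (phi_next @ self.theta)
        residual = target - phi @ self.theta
        # Kalman gain for a scalar observation.
        s = phi @ self.P @ phi + self.r_var
        k = self.P @ phi / s
        self.theta += k * residual
        self.P -= np.outer(k, phi @ self.P)

# Toy usage on a stream of random-feature transitions.
rng = np.random.default_rng(0)
agent = KalmanTD(d=4)
for _ in range(1000):
    phi, phi_next = rng.normal(size=4), rng.normal(size=4)
    reward = phi.sum() + 0.1 * rng.normal()
    agent.update(phi, phi_next, reward)
print("learned weights:", agent.theta)

The process-noise term Q is what lets the filter keep adapting under drift: with Q = 0 the covariance P shrinks monotonically and the recursion degenerates to recursive least-squares, which eventually stops tracking a moving target, whereas Q > 0 keeps the effective step size bounded away from zero, matching the abstract's emphasis on non-stationary dynamics.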
Pages: 12
Related Papers
50 records in total
  • [21] Learning to learn by gradient descent by gradient descent
    Andrychowicz, Marcin
    Denil, Misha
    Colmenarejo, Sergio Gomez
    Hoffman, Matthew W.
    Pfau, David
    Schaul, Tom
    Shillingford, Brendan
    de Freitas, Nando
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [22] FORBID: Fast Overlap Removal by Stochastic GradIent Descent for Graph Drawing
    Giovannangeli, Loann
    Lalanne, Frederic
    Giot, Romain
    Bourqui, Romain
    GRAPH DRAWING AND NETWORK VISUALIZATION, GD 2022, 2023, 13764 : 61 - 76
  • [23] A fast non-monotone line search for stochastic gradient descent
    Fathi Hafshejani, Sajad
    Gaur, Daya
    Hossain, Shahadat
    Benkoczi, Robert
    OPTIMIZATION AND ENGINEERING, 2024, 25 (02) : 1105 - 1124
  • [25] Nonlinear Optimization Method Based on Stochastic Gradient Descent for Fast Convergence
    Watanabe, Takahiro
    Iima, Hitoshi
    2018 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2018, : 4198 - 4203
  • [26] Large-Scale Machine Learning with Stochastic Gradient Descent
    Bottou, Leon
    COMPSTAT'2010: 19TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL STATISTICS, 2010, : 177 - 186
  • [27] Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning
    Yang, Zhenhuan
    Lei, Yunwen
    Wang, Puyu
    Yang, Tianbao
    Ying, Yiming
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [28] Convergence diagnostics for stochastic gradient descent with constant learning rate
    Chee, Jerry
    Toulis, Panos
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84
  • [29] Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent
    Liu, Kangqiao
    Liu Ziyin
    Ueda, Masahito
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [30] Learning curves for stochastic gradient descent in linear feedforward networks
    Werfel, J
    Xie, XH
    Seung, HS
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 16, 2004, 16 : 1197 - 1204