Optimizing Q-Learning with K-FAC Algorithm

Cited by: 0

Authors:

Beltiukov, Roman [1]

Affiliations:

[1] Peter Great St Petersburg Polytech Univ, St Petersburg, Russia

Source:

Analysis of Images, Social Networks and Texts (AIST 2019) | 2020, Vol. 1086

Keywords:

Q-learning; K-FAC; Reinforcement learning; Natural gradient;

DOI:

10.1007/978-3-030-39575-9_1

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this work, we present intermediate results of applying the Kronecker-factored Approximate Curvature (K-FAC) algorithm to the Q-learning problem. Although K-FAC is more expensive to compute than plain stochastic gradient descent, it allows the agent to converge somewhat faster than Adam, measured in epochs, on simple reinforcement learning tasks, and it tends to be more stable and less sensitive to hyperparameter selection. Based on our latest results, we show that DDQN with K-FAC learns more quickly than with other optimizers and improves steadily, in contrast to comparable agents trained with Adam or RMSProp.
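As a rough illustration of the kind of update K-FAC performs, the sketch below preconditions the gradient of a single linear layer `s = W a` with two small Kronecker factors instead of a full curvature matrix. It is not taken from the paper: the function name `kfac_precondition`, the simple per-factor damping, the default `damping` value, and the random data are all assumptions made for illustration.

```python
import numpy as np


def kfac_precondition(grad_w, acts, grad_out, damping=1e-2):
    """Approximate natural-gradient direction for one linear layer s = W a.

    grad_w:   (out, in) gradient of the loss with respect to W
    acts:     (batch, in) layer inputs a
    grad_out: (batch, out) back-propagated gradients g with respect to s
    damping:  hypothetical per-factor damping, added for numerical stability
    """
    batch = acts.shape[0]
    # Kronecker factors: second moments of the inputs and of the output gradients.
    A = acts.T @ acts / batch + damping * np.eye(acts.shape[1])
    G = grad_out.T @ grad_out / batch + damping * np.eye(grad_out.shape[1])
    # (A kron G)^-1 vec(grad_w) equals vec(G^-1 grad_w A^-1), so only the
    # small factors are solved against; the full matrix is never formed.
    left = np.linalg.solve(G, grad_w)       # G^-1 grad_w
    return np.linalg.solve(A, left.T).T     # (G^-1 grad_w) A^-1, A is symmetric


# Illustrative random data only: batch of 32, 4 inputs, 3 outputs.
rng = np.random.default_rng(0)
acts = rng.normal(size=(32, 4))
grad_out = rng.normal(size=(32, 3))
grad_w = grad_out.T @ acts / acts.shape[0]  # batch-averaged weight gradient

update = kfac_precondition(grad_w, acts, grad_out)
W = np.zeros((3, 4))
W -= 0.1 * update  # one preconditioned step with an assumed learning rate
```

In a Q-learning setting, `grad_out` would come from back-propagating the temporal-difference loss through the Q-network, and one such pair of factors would be kept per layer. The extra cost relative to plain gradient descent comes from building and solving against `A` and `G`.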

Pages: 3-8

Page count: 6
