Optimizing Q-Learning with K-FAC Algorithm

Authors
Beltiukov, Roman [1]
Affiliation
[1] Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russia
Keywords
Q-learning; K-FAC; Reinforcement learning; Natural gradient
DOI
10.1007/978-3-030-39575-9_1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we present intermediate results of applying the Kronecker-factored Approximate Curvature (K-FAC) algorithm to the Q-learning problem. Although more expensive to compute than plain stochastic gradient descent, K-FAC allows the agent to converge somewhat faster, in terms of epochs, than Adam on simple reinforcement learning tasks, and it tends to be more stable and less sensitive to hyperparameter selection. Based on the latest results, we show that DDQN with K-FAC learns more quickly than with other optimizers and improves steadily, in contrast to the same agent trained with Adam or RMSProp.
Pages: 3-8
Number of pages: 6