Optimizing Q-Learning with K-FAC Algorithm

Cited by: 0
Authors
Beltiukov, Roman [1]
Affiliations
[1] Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russia
Keywords
Q-learning; K-FAC; Reinforcement learning; Natural gradient
DOI
10.1007/978-3-030-39575-9_1
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we present intermediate results from applying the Kronecker-factored Approximate Curvature (K-FAC) algorithm to the Q-learning problem. Although it is more expensive to compute than plain stochastic gradient descent, K-FAC allows the agent to converge slightly faster than Adam in terms of epochs on simple reinforcement learning tasks, and it tends to be more stable and less sensitive to hyperparameter selection. Based on our latest results, we show that DDQN with K-FAC learns more quickly than with other optimizers and improves consistently, in contrast to the same agent trained with Adam or RMSProp.
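The abstract does not describe the implementation, so the following is only a minimal sketch of the standard Double DQN (DDQN) bootstrap target that any of the optimizers named above (K-FAC, Adam, RMSProp) would regress the Q-network toward. The function name `double_dqn_target`, its parameters, and the Q-values are hypothetical and chosen for illustration; K-FAC itself is not implemented here.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    done: bool,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,
) -> float:
    """Return the Double DQN bootstrap target for a single transition."""
    if done:
        # Terminal transition: nothing to bootstrap from the next state.
        return reward
    # The online network selects the greedy action in the next state ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... and the target network evaluates that selected action.
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for a two-action task (not taken from the paper).
target = double_dqn_target(
    reward=1.0,
    done=False,
    next_q_online=[0.2, 0.5],
    next_q_target=[0.4, 0.3],
    gamma=0.9,
)
```

In this example the online network prefers action 1, so the target uses the target network's value for action 1 (0.3) and not its own maximum (0.4). Decoupling selection from evaluation in this way is what distinguishes DDQN from plain deep Q-learning.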
Pages: 3-8
Number of pages: 6
Related papers
50 in total
  • [1] Convolutional Neural Network Training with Distributed K-FAC
    Pauloski, J. Gregory
    Zhang, Zhao
    Huang, Lei
    Xu, Weijia
    Foster, Ian T.
    [J]. PROCEEDINGS OF SC20: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC20), 2020
  • [2] Deep Neural Network Training With Distributed K-FAC
    Pauloski, J. Gregory
    Huang, Lei
    Xu, Weijia
    Chard, Kyle
    Foster, Ian T.
    Zhang, Zhao
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12) : 3616 - 3627
  • [3] Inefficiency of K-FAC for Large Batch Size Training
    Ma, Linjian
    Montague, Gabe
    Ye, Jiayu
    Yao, Zhewei
    Gholami, Amir
    Keutzer, Kurt
    Mahoney, Michael W.
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 5053 - 5060
  • [4] Backward Q-learning: The combination of Sarsa algorithm and Q-learning
    Wang, Yin-Hao
    Li, Tzuu-Hseng S.
    Lin, Chih-Jui
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2013, 26 (09) : 2184 - 2193
  • [5] K-FAC Algorithm Based on the Sherman-Morrison Formula
    Liu, Xiaolei
    Gao, Kaixin
    Wang, Yong
    [J]. Computer Systems & Applications, 2021, 30 (04) : 118 - 124
  • [6] Accelerating Distributed K-FAC with Smart Parallelism of Computing and Communication Tasks
    Shi, Shaohuai
    Zhang, Lin
    Li, Bo
    [J]. 2021 IEEE 41ST INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2021), 2021 : 550 - 560
  • [7] Scalable K-FAC Training for Deep Neural Networks With Distributed Preconditioning
    Zhang, Lin
    Shi, Shaohuai
    Wang, Wei
    Li, Bo
    [J]. IEEE TRANSACTIONS ON CLOUD COMPUTING, 2023, 11 (03) : 2365 - 2378
  • [8] Randomized K-FACs: Speeding Up K-FAC with Randomized Numerical Linear Algebra
    Puiu, Constantin Octavian
    [J]. INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2022, 2022, 13756 : 411 - 422
  • [9] Q-Learning
    Watkins, C. J. C. H.
    Dayan, P.
    [J]. MACHINE LEARNING, 1992, 8 (3-4) : 279 - 292
  • [10] Deep Reinforcement Learning: From Q-Learning to Deep Q-Learning
    Tan, Fuxiao
    Yan, Pengfei
    Guan, Xinping
    [J]. NEURAL INFORMATION PROCESSING (ICONIP 2017), PT IV, 2017, 10637 : 475 - 483