An Online Actor-Critic Learning Approach with Levenberg-Marquardt Algorithm

Authors
Ni, Zhen [1]
He, Haibo [1]
Prokhorov, Danil V. [2]
Fu, Jian [3]
Affiliations
[1] Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881 USA
[2] Toyota Res Inst NA, TTC, Ann Arbor, MI 48105 USA
[3] Wuhan Univ Technol, Sch Automat, Wuhan 430070, Peoples R China
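Funding
U.S. National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract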
This paper focuses on improving the efficiency of online actor-critic design based on the Levenberg-Marquardt (LM) algorithm rather than the traditional chain rule. Over the decades, several generations of adaptive/approximate dynamic programming (ADP) structures have been proposed in the community and have been demonstrated in many successful applications. Neural networks with backpropagation have been one of the most important approaches to tuning the parameters in such ADP designs. In this paper, we study the integration of the LM method into the regular actor-critic design to improve weight updating and learning, with quadratic convergence under certain conditions. Specifically, for the critic network design, we adopt the LM method, targeting improved learning performance, while for the action network, we use a neural network with backpropagation to provide an appropriate control action. A detailed learning algorithm is presented, followed by benchmark tests on the pendulum swing-up and balance task and the cart-pole balance task. Various simulation results and a comparative study demonstrate the effectiveness of this approach.
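The critic update described in the abstract can be illustrated with a minimal sketch of a damped LM step. This is not the paper's algorithm: the one-neuron tanh critic, the fixed `target` (a stand-in for whatever temporal-difference target the critic would track), and the names `critic_value`, `lm_step`, and `train` are all assumptions made for illustration. With a single scalar residual e and Jacobian row g, the general LM step -(g gᵀ + μI)⁻¹ g e reduces to -g e / (μ + ‖g‖²), which avoids a matrix solve.

```python
import math


def critic_value(w, x):
    """Hypothetical one-neuron critic: J = tanh(w . x)."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)))


def lm_step(w, x, target, mu):
    """One LM step for the single residual e = J(w, x) - target."""
    j = critic_value(w, x)
    e = j - target
    # Jacobian row de/dw of the tanh critic: (1 - J^2) * x
    g = [(1.0 - j * j) * xi for xi in x]
    # For one residual, (g g^T + mu I)^-1 g e equals g e / (mu + ||g||^2).
    denom = mu + sum(gi * gi for gi in g)
    return [wi - gi * e / denom for wi, gi in zip(w, g)]


def train(w, x, target, mu=1.0, steps=50):
    """Repeat LM steps with a simple accept/reject damping schedule (assumed)."""
    for _ in range(steps):
        old_err = (critic_value(w, x) - target) ** 2
        trial = lm_step(w, x, target, mu)
        new_err = (critic_value(trial, x) - target) ** 2
        if new_err < old_err:
            w, mu = trial, mu / 10.0  # accept: move toward Gauss-Newton
        else:
            mu *= 10.0  # reject: move toward small gradient steps
    return w, mu


if __name__ == "__main__":
    weights, damping = train([0.1, -0.2], [1.0, 0.5], target=0.5)
    print(weights, critic_value(weights, [1.0, 0.5]))
```

As the damping μ shrinks, the step approaches a Gauss-Newton step, which is where the fast local convergence mentioned in the abstract comes from; a large μ falls back to a short gradient-descent-like step. A full critic network would need the Jacobian of every weight and a linear solve in place of the closed form used here.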
Pages: 2333-2340
Number of pages: 8