An Online Actor-Critic Learning Approach with Levenberg-Marquardt Algorithm

Cited by: 0

Authors:

Ni, Zhen [1]

He, Haibo [1]

Prokhorov, Danil V. [2]

Fu, Jian [3]

Affiliations:

[1] Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881 USA

[2] Toyota Res Inst NA, TTC, Ann Arbor, MI 48105 USA

[3] Wuhan Univ Technol, Sch Automat, Wuhan 430070, Peoples R China

Source:

2011 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2011

Funding:

National Science Foundation (USA);

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper focuses on improving the efficiency of online actor-critic design based on the Levenberg-Marquardt (LM) algorithm rather than the traditional chain rule. Over the decades, several generations of adaptive/approximate dynamic programming (ADP) structures have been proposed in the community and have demonstrated many successful applications. Neural networks with backpropagation have been one of the most important approaches for tuning the parameters in such ADP designs. In this paper, we study the integration of the Levenberg-Marquardt method into the regular actor-critic design to improve weight updating and learning, aiming at quadratic convergence under certain conditions. Specifically, for the critic network design, we adopt the LM method to improve learning performance, while for the action network, we use a neural network with backpropagation to provide an appropriate control action. A detailed learning algorithm is presented, followed by benchmark tests on the pendulum swing-up-and-balance and cart-pole balance tasks. Various simulation results and a comparative study demonstrate the effectiveness of this approach.
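For readers unfamiliar with the LM method mentioned in the abstract, the following is a minimal sketch of a generic damped least-squares (Levenberg-Marquardt) update, which solves (J^T J + mu I) dw = -J^T e at each step. It is not the paper's critic-network algorithm: the function names (`lm_step`, `lm_fit`), the tenfold damping schedule, and the toy model a * tanh(b * x) are all assumptions chosen for illustration.

```python
import numpy as np

def lm_step(residual_fn, jacobian_fn, w, mu):
    """One LM step: solve (J^T J + mu * I) dw = -J^T e and return w + dw."""
    e = residual_fn(w)
    J = jacobian_fn(w)
    A = J.T @ J + mu * np.eye(w.size)
    dw = np.linalg.solve(A, -J.T @ e)
    return w + dw

def lm_fit(residual_fn, jacobian_fn, w0, mu=1e-2, n_iter=100):
    """Repeat LM steps, adapting the damping factor mu (hypothetical schedule)."""
    w = np.asarray(w0, dtype=float)
    cost = 0.5 * np.sum(residual_fn(w) ** 2)
    for _ in range(n_iter):
        w_new = lm_step(residual_fn, jacobian_fn, w, mu)
        new_cost = 0.5 * np.sum(residual_fn(w_new) ** 2)
        if new_cost < cost:
            # Accept the step and move toward Gauss-Newton behaviour.
            w, cost = w_new, new_cost
            mu = max(mu / 10.0, 1e-12)
        else:
            # Reject the step and move toward small gradient-descent steps.
            mu = min(mu * 10.0, 1e12)
    return w, cost

# Toy problem (an assumption, not from the paper): fit y = a * tanh(b * x).
x = np.linspace(-2.0, 2.0, 21)
y = 2.0 * np.tanh(0.5 * x)

def residuals(w):
    a, b = w
    return a * np.tanh(b * x) - y

def jacobian(w):
    a, b = w
    t = np.tanh(b * x)
    # Columns: d(residual)/da and d(residual)/db.
    return np.column_stack([t, a * x * (1.0 - t ** 2)])

w_fit, final_cost = lm_fit(residuals, jacobian, w0=[1.0, 1.0])
print(w_fit, final_cost)
```

The damping factor `mu` is what distinguishes LM from plain Gauss-Newton: a small `mu` gives fast convergence near a solution, while a large `mu` falls back to short, gradient-like steps when a trial update would increase the error.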


Pages: 2333-2340

Page count: 8

50 records in total

[1] A New Computational Approach to the Levenberg-Marquardt Learning Algorithm
Bilski, Jaroslaw
Kowalczyk, Bartosz
Smolag, Jacek
[J]. ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, ICAISC 2022, PT I, 2023, 13588 : 16 - 26
[2] Parallel Approach to the Levenberg-Marquardt Learning Algorithm for Feedforward Neural Networks
Bilski, Jaroslaw
Smolag, Jacek
Zurada, Jacek M.
[J]. ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, PT I, 2015, 9119 : 3 - 14
[3] A Parallel Levenberg-Marquardt Algorithm
Cao, Jun
Novstrup, Krista A.
Goyal, Ayush
Midkiff, Samuel R.
Caruthers, James M.
[J]. ICS'09: PROCEEDINGS OF THE 2009 ACM SIGARCH INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, 2009 : 450 - 459
[4] Adaptive Levenberg-Marquardt Algorithm: A New Optimization Strategy for Levenberg-Marquardt Neural Networks
Yan, Zhiqi
Zhong, Shisheng
Lin, Lin
Cui, Zhiquan
[J]. MATHEMATICS, 2021, 9 (17)
[5] Online Levenberg-Marquardt Algorithm for Digital Predistortion Based on Direct Learning and Indirect Learning Architectures
Chen Limin
Liang Yin
Wan Guojin
[J]. FOURTH INTERNATIONAL CONFERENCE ON DIGITAL IMAGE PROCESSING (ICDIP 2012), 2012, 8334
[6] The Parallel Modification to the Levenberg-Marquardt Algorithm
Bilski, Jaroslaw
Kowalczyk, Bartosz
Grzanek, Konrad
[J]. ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, ICAISC 2018, PT I, 2018, 10841 : 15 - 24
[7] A modified actor-critic reinforcement learning algorithm
Mustapha, SM
Lachiver, G
[J]. 2000 CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, CONFERENCE PROCEEDINGS, VOLS 1 AND 2: NAVIGATING TO A NEW ERA, 2000 : 605 - 609
[8] The application and modeling of the Levenberg-Marquardt algorithm
Li, Jian-rong
[J]. 2010 2ND INTERNATIONAL CONFERENCE ON E-BUSINESS AND INFORMATION SYSTEM SECURITY (EBISS 2010), 2010 : 278 - 280
[9] Levenberg-Marquardt Deep Learning Algorithm for Sulfur Dioxide Prediction
Asklany, Somia
Mansouri, Wahida
Othmen, Salwa
[J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2019, 19 (12) : 7 - 12
[10] LOCAL LEVENBERG-MARQUARDT ALGORITHM FOR LEARNING FEEDFORWAD NEURAL NETWORKS
Bilski, Jaroslaw
Kowalczyk, Bartosz
Marchlewska, Alina
Zurada, Jacek M.
[J]. JOURNAL OF ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING RESEARCH, 2020, 10 (04) : 299 - 316
