Efficient Actor-Critic Algorithm with Hierarchical Model Learning and Planning

Cited by: 4
Authors
Zhong, Shan [1 ,2 ]
Liu, Quan [1 ,3 ,4 ]
Fu, QiMing [5 ]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215000, Jiangsu, Peoples R China
[2] Changshu Inst Technol, Sch Comp Sci & Engn, Changshu 215500, Jiangsu, Peoples R China
[3] Collaborat Innovat Ctr Novel Software Technol & I, Nanjing 210000, Jiangsu, Peoples R China
[4] Jilin Univ, Minist Educ, Key Lab Symbol Computat & Knowledge Engn, Changchun 130012, Peoples R China
[5] Suzhou Univ Sci & Technol, Coll Elect & Informat Engn, Suzhou 215000, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
CONTINUOUS-TIME; REINFORCEMENT;
DOI
10.1155/2016/4824072
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
To improve convergence rate and sample efficiency, two learning methods, AC-HMLP and RAC-HMLP (AC-HMLP with ℓ2-regularization), are proposed by combining the actor-critic algorithm with hierarchical model learning and planning. The hierarchical model consists of a local model and a global model, which are learned concurrently with the value function and the policy and are approximated by local linear regression (LLR) and linear function approximation (LFA), respectively. Both models generate samples for planning: the local model is used at each time step, but only if its state-prediction error does not exceed a threshold, while the global model is used at the end of each episode. Using both models is intended to improve sample efficiency and accelerate convergence by exploiting both local and global information. Experimentally, AC-HMLP and RAC-HMLP are compared with three representative algorithms on two reinforcement learning (RL) benchmark problems. The results show that they perform best in terms of convergence rate and sample efficiency.
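The planning schedule described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the paper's implementation: the function names, the Euclidean error measure, the toy one-dimensional models, and the choice to replay visited state-action pairs through the global model are all assumptions made for this example. The actual LLR and LFA models and the actor-critic updates are not shown.

```python
import math


def prediction_error(predicted, observed):
    """Euclidean distance between a predicted and an observed next state (assumed metric)."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)))


def schedule_planning(transitions, local_model, global_model, threshold):
    """Collect simulated samples for planning over one episode.

    transitions: list of (state, action, observed_next_state) tuples.
    local_model, global_model: hypothetical callables (state, action) -> next_state.
    """
    samples = []
    for state, action, next_state in transitions:
        predicted = local_model(state, action)
        # Local planning at each time step, only if the state-prediction
        # error does not exceed the threshold.
        if prediction_error(predicted, next_state) <= threshold:
            samples.append(("local", state, action, predicted))
    # Global planning once, at the end of the episode. Replaying the visited
    # state-action pairs is an assumption of this sketch.
    for state, action, _ in transitions:
        samples.append(("global", state, action, global_model(state, action)))
    return samples


def toy_local_model(state, action):
    # Hypothetical model: exact for small states, off by 0.5 otherwise.
    offset = 0.0 if state[0] < 2.0 else 0.5
    return (state[0] + action + offset,)


def toy_global_model(state, action):
    # Hypothetical coarse linear model.
    return (state[0] + 0.9 * action,)


transitions = [((0.0,), 1.0, (1.0,)), ((1.0,), 1.0, (2.0,)), ((2.0,), 1.0, (3.0,))]
samples = schedule_planning(transitions, toy_local_model, toy_global_model, threshold=0.1)
sources = [s[0] for s in samples]
```

In this toy episode the local model is skipped on the last step, where its prediction error exceeds the threshold, while the global model contributes one sample per visited pair after the episode ends.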
Pages: 15
Related papers
50 in total
  • [1] Efficient Actor-critic Algorithm with Dual Piecewise Model Learning
    Zhong, Shan
    Liu, Quan
    Gong, Shengrong
    Fu, Qiming
    Xu, Jin
    [J]. 2017 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2017, : 823 - 830
  • [2] Efficient Model Learning Methods for Actor-Critic Control
    Grondman, Ivo
    Vaandrager, Maarten
    Busoniu, Lucian
    Babuska, Robert
    Schuitema, Erik
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2012, 42 (03): : 591 - 602
  • [3] An Actor-Critic Hierarchical Reinforcement Learning Model for Course Recommendation
    Liang, Kun
    Zhang, Guoqiang
    Guo, Jinhui
    Li, Wentao
    [J]. ELECTRONICS, 2023, 12 (24)
  • [4] Curious Hierarchical Actor-Critic Reinforcement Learning
    Roeder, Frank
    Eppe, Manfred
    Nguyen, Phuong D. H.
    Wermter, Stefan
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT II, 2020, 12397 : 408 - 419
  • [5] A modified actor-critic reinforcement learning algorithm
    Mustapha, SM
    Lachiver, G
    [J]. 2000 CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, CONFERENCE PROCEEDINGS, VOLS 1 AND 2: NAVIGATING TO A NEW ERA, 2000, : 605 - 609
  • [6] A World Model for Actor-Critic in Reinforcement Learning
    Panov, A. I.
    Ugadiarov, L. A.
    [J]. PATTERN RECOGNITION AND IMAGE ANALYSIS, 2023, 33 (03) : 467 - 477
  • [7] A Hessian Actor-Critic Algorithm
    Wang, Jing
    Paschalidis, Ioannis Ch
    [J]. 2014 IEEE 53RD ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2014, : 1131 - 1136
  • [8] Twin Delayed Hierarchical Actor-Critic
    Anca, Mihai
    Studley, Matthew
    [J]. 2021 7TH INTERNATIONAL CONFERENCE ON AUTOMATION, ROBOTICS AND APPLICATIONS (ICARA 2021), 2021, : 221 - 225
  • [9] An Actor-Critic Algorithm With Second-Order Actor and Critic
    Wang, Jing
    Paschalidis, Ioannis Ch.
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2017, 62 (06) : 2689 - 2703
  • [10] Network Congestion Control Algorithm Based on Actor-Critic Reinforcement Learning Model
    Xu, Tao
    Gong, Lina
    Zhang, Wei
    Li, Xuhong
    Wang, Xia
    Pan, Wenwen
    [J]. ADVANCES IN MATERIALS, MACHINERY, ELECTRONICS II, 2018, 1955