Learning with eligibility traces in adaptive critic designs

Cited by: 0
Authors
Xu, Jing [1]
Liang, Fu-Ming [1]
Yu, Wen-Sheng [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Beijing 100080, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
adaptive critic designs; eligibility traces; action-dependent heuristic dynamic programming;
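DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract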
In this paper, we study training strategies for the critic network in adaptive critic designs. The conventional strategy is to conduct an internal training cycle for the specified object at each time step, based on the relevant information from two consecutive moments. In our work, by contrast, the mechanism of eligibility traces is adopted for learning the cost function or its derivatives. The new learning rule depends on the current error combined with traces of past events. The approach is applied to a classical adaptive critic design, namely action-dependent heuristic dynamic programming, on the classical single-cart problem. Comparative results show that our approach is more efficient in terms of learning speed and learning success rate.
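As a rough illustration of the idea that the update "depends on the current error combined with traces of past events", the following is a minimal sketch of a TD(lambda)-style critic step with accumulating eligibility traces. It assumes a linear critic over feature vectors, not the neural critic network the paper trains, and the function name `critic_step` and the parameters `alpha`, `gamma`, and `lam` are hypothetical choices for this sketch rather than the authors' notation.

```python
def critic_step(w, e, phi, phi_next, cost,
                alpha=0.1, gamma=0.95, lam=0.8, terminal=False):
    """One critic update with accumulating eligibility traces.

    w        : critic weights (list of floats); assumed linear critic J = w . phi
    e        : eligibility trace, same length as w
    phi      : features of the current state-action pair
    phi_next : features of the next state-action pair
    cost     : one-step cost (reinforcement signal) observed on the transition
    """
    j = sum(wi * pi for wi, pi in zip(w, phi))
    j_next = 0.0 if terminal else sum(wi * pi for wi, pi in zip(w, phi_next))

    # Current temporal-difference error between two consecutive moments.
    delta = cost + gamma * j_next - j

    # Decay the traces of past events, then add the current gradient (phi).
    e = [gamma * lam * ei + pi for ei, pi in zip(e, phi)]

    # Every weight moves in proportion to the current error and its own trace.
    w = [wi + alpha * delta * ei for wi, ei in zip(w, e)]
    return w, e, delta


# Two consecutive transitions, each with cost 1; the second one is terminal.
w, e = [0.0, 0.0], [0.0, 0.0]
w, e, _ = critic_step(w, e, [1.0, 0.0], [0.0, 1.0], cost=1.0)
w, e, _ = critic_step(w, e, [0.0, 1.0], [0.0, 0.0], cost=1.0, terminal=True)
print(w)  # the first weight also receives credit for the second error
```

With `lam=0.0` the trace reduces to the current features, and the step falls back to an update that uses only the two consecutive moments, which corresponds to the conventional strategy described in the abstract.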
Pages: 309 / +
Page count: 2
Related papers
50 in total
  • [1] Adaptive Eligibility Traces for Online Deep Reinforcement Learning
    Kobayashi, Taisuke
    INTELLIGENT AUTONOMOUS SYSTEMS 16, IAS-16, 2022, 412 : 417 - 428
  • [2] Adaptive critic designs
    Prokhorov, DV
    Wunsch, DC
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 1997, 8 (05): : 997 - 1007
  • [3] Adaptive Fuzzy Watkins: A New Adaptive Approach for Eligibility Traces in Reinforcement Learning
    Shokri, Matin
    Khasteh, Seyed Hossein
    Aminifar, Amin
    INTERNATIONAL JOURNAL OF FUZZY SYSTEMS, 2019, 21 (05) : 1443 - 1454
  • [4] Adaptive Fuzzy Watkins: A New Adaptive Approach for Eligibility Traces in Reinforcement Learning
    Matin Shokri
    Seyed Hossein Khasteh
    Amin Aminifar
    International Journal of Fuzzy Systems, 2019, 21 : 1443 - 1454
  • [5] Continuous adaptive critic designs
    Hanselmann, T
    Noakes, L
    Zaknich, A
    Proceedings of the International Joint Conference on Neural Networks (IJCNN), Vols 1-5, 2005, : 3001 - 3006
  • [6] Eligibility traces as a synaptic substrate for learning
    Shouval, Harel Z.
    Kirkwood, Alfredo
    CURRENT OPINION IN NEUROBIOLOGY, 2025, 91
  • [7] Reinforcement learning with replacing eligibility traces
    Singh, SP
    Sutton, RS
    MACHINE LEARNING, 1996, 22 (1-3) : 123 - 158
  • [8] Novel Discounted Adaptive Critic Control Designs With Accelerated Learning Formulation
    Ha, Mingming
    Wang, Ding
    Liu, Derong
    IEEE TRANSACTIONS ON CYBERNETICS, 2024, 54 (05) : 3003 - 3016
  • [9] Neurocontrol of turbogenerators with adaptive critic designs
    Venayagamoorthy, Ganesh K.
    Wunsch II, Donald C.
    Harley, Ronald G.
    IEEE AFRICON Conference, 1999, 1 : 489 - 494
  • [10] Adaptive and multiple time-scale eligibility traces for online deep reinforcement learning
    Kobayashi, Taisuke
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2022, 151