Learning with eligibility traces in adaptive critic designs

Cited by: 0
Authors
Xu, Jing [1]
Liang, Fu-Ming [1]
Yu, Wen-Sheng [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Beijing 100080, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
adaptive critic designs; eligibility traces; action-dependent heuristic dynamic programming;
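DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract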
In this paper, we study training strategies for the critic network in adaptive critic designs. The conventional approach is to run an internal training cycle for the specified object at each time step, based on the relevant information from two consecutive moments. In our work, by contrast, the mechanism of eligibility traces is adopted for learning the cost function or its derivatives. The new learning rule depends on the current error combined with traces of past events. The strategy is tested on the classical single-cart problem with one adaptive critic design, namely action-dependent heuristic dynamic programming. The comparison results show that our approach is more efficient in terms of learning speed and learning success rate.
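The idea that learning "depends on the current error combined with traces of past events" can be illustrated with a generic TD(lambda) update. The following is a minimal sketch and not the paper's method: it assumes a tabular critic (one weight per state), accumulating traces, and a hypothetical 5-state random walk instead of the cart benchmark and neural critic used by the authors. The function name `td_lambda_critic` and all parameter values are illustrative assumptions.

```python
import random


def td_lambda_critic(episodes=5000, n_states=5, alpha=0.02,
                     gamma=1.0, lam=0.8, seed=0):
    """Tabular TD(lambda) critic on a hypothetical random-walk chain.

    The walk starts in the middle state and moves left or right with
    equal probability. Leaving the chain on the right yields a signal
    of 1, on the left 0. This toy task is an assumption for illustration.
    """
    rng = random.Random(seed)
    w = [0.0] * n_states              # critic estimate, one weight per state
    for _ in range(episodes):
        e = [0.0] * n_states          # eligibility traces, reset each episode
        s = n_states // 2
        while True:
            s_next = s + rng.choice((-1, 1))
            done = s_next < 0 or s_next >= n_states
            signal = 1.0 if s_next >= n_states else 0.0
            v_next = 0.0 if done else w[s_next]
            # Current one-step temporal-difference error.
            delta = signal + gamma * v_next - w[s]
            # Decay all traces, then mark the state just visited.
            for i in range(n_states):
                e[i] *= gamma * lam
            e[s] += 1.0
            # Every recently visited state shares in the current error.
            for i in range(n_states):
                w[i] += alpha * delta * e[i]
            if done:
                break
            s = s_next
    return w


if __name__ == "__main__":
    print([round(v, 2) for v in td_lambda_critic()])
```

With `lam=0.0` the traces vanish after one step and the update reduces to the conventional rule that uses only two consecutive moments. Larger `lam` spreads each error over earlier states, which is the general mechanism the abstract refers to.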
Pages: 309 / +
Page count: 2