Exponential TD Learning: A Risk-Sensitive Actor-Critic Reinforcement Learning Algorithm

Cited by: 0
Authors
Noorani, Erfaun [1, 2]
Mavridis, Christos N. [1, 2]
Baras, John S. [1, 2]
Affiliations
[1] Univ Maryland, Dept Elect & Comp Engn, College Pk, MD 20742 USA
[2] Univ Maryland, Inst Syst Res ISR, College Pk, MD 20742 USA
Source
2023 AMERICAN CONTROL CONFERENCE, ACC | 2023
DOI
10.23919/ACC55779.2023.10156626
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Incorporating risk into the decision-making process has been shown to significantly improve the performance of optimal control and reinforcement learning algorithms. We construct a temporal-difference, risk-sensitive reinforcement learning algorithm using the exponential criterion commonly used in risk-sensitive control. The proposed method resembles an actor-critic architecture in which the 'actor' implements a policy gradient algorithm based on the exponential of the reward-to-go, which is estimated by the 'critic'. The novelty of the critic's update rule lies in its use of a modified objective function that corresponds to the underlying multiplicative Bellman equation. Our results suggest that using the exponential criterion accelerates the learning process and reduces its variance; that is, actor-critic methods can exploit risk sensitivity to improve performance.
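As an illustration of the multiplicative Bellman equation mentioned in the abstract, the following is a minimal sketch of a tabular critic under stated assumptions, not the paper's actual update rule (which uses a modified objective function not given in this record). It assumes an undiscounted episodic task and a fixed policy, and estimates V(s) = E[exp(beta * reward-to-go) | s] with a plain stochastic-approximation step toward the target exp(beta * r) * V(s'). The names `exponential_td_evaluate`, `certainty_equivalent`, and the `transitions` format are hypothetical, chosen only for this example.

```python
import math
import random


def exponential_td_evaluate(transitions, start, beta, alpha=0.1, episodes=2000, seed=0):
    """Tabular estimate of V(s) = E[exp(beta * reward-to-go) | s] for a fixed policy.

    transitions maps state -> list of (probability, reward, next_state);
    next_state None marks the end of an episode. (Hypothetical format.)
    """
    rng = random.Random(seed)
    V = {s: 1.0 for s in transitions}  # exp(beta * 0) = 1 when no reward remains
    for _ in range(episodes):
        s = start
        while s is not None:
            # Sample one outcome of the current state.
            u, cum = rng.random(), 0.0
            for p, r, s_next in transitions[s]:
                cum += p
                if u <= cum:
                    break
            v_next = 1.0 if s_next is None else V[s_next]
            # Multiplicative target: exp(beta * r) * V(s') takes the place of r + V(s').
            target = math.exp(beta * r) * v_next
            V[s] += alpha * (target - V[s])
            s = s_next
    return V


def certainty_equivalent(v, beta):
    """Map an exponential value back to reward units: (1 / beta) * log(v)."""
    return math.log(v) / beta
```

On a deterministic chain the estimate approaches exp(beta * total reward), so the certainty equivalent recovers the plain return. With stochastic rewards and beta < 0, the certainty equivalent falls below the expected return, which is the risk-averse behavior the exponential criterion is meant to capture.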
Pages: 4104-4109
Page count: 6