Heavy-Tailed Reinforcement Learning With Penalized Robust Estimator

Authors
Park, Hyeon-Jun [1]
Lee, Kyungjae [1]
Affiliations
[1] Chung Ang Univ, Dept Artificial Intelligence, Seoul 06974, South Korea
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Noise measurement; Heavily-tailed distribution; Q-learning; Stochastic processes; Random variables; Object recognition; Markov decision processes; Reinforcement learning; heavy-tailed noise; regret analysis;
DOI
10.1109/ACCESS.2024.3424828
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
We consider finite-horizon episodic reinforcement learning (RL) under heavy-tailed noise, where the p-th moment of the noise is bounded for any p ∈ (1, 2]. In this setting, existing RL algorithms are limited by their requirement for prior knowledge of the bounded moment order of the noise distribution. This requirement hinders their practical application, as such prior information is rarely available in real-world scenarios. Our proposed method eliminates the need for this prior knowledge, enabling implementation in a wider range of scenarios. We introduce two RL algorithms, p-Heavy-UCRL and p-Heavy-Q-learning, designed for model-based and model-free RL settings, respectively. Without the need for prior knowledge, these algorithms are robust to heavy-tailed noise and achieve nearly optimal regret bounds, up to logarithmic terms, with the same dependencies on dominating terms as existing algorithms. Finally, we show that our proposed algorithms perform empirically on par with existing algorithms in a synthetic tabular scenario.
Pages: 107800-107817
Number of pages: 18
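
To make concrete the "prior knowledge about the bounded moment order" that the abstract refers to, the following is a minimal sketch of a classical truncated-mean estimator from the heavy-tailed bandit literature. It is not the penalized robust estimator of this paper, whose definition does not appear in this record. The function name `truncated_mean` and the parameters `u` (an assumed bound on the p-th absolute moment) and `delta` (a failure probability) are illustrative assumptions.

```python
import math


def truncated_mean(samples, p, u, delta):
    """Truncated empirical mean for heavy-tailed samples.

    Assumption (not from the paper): E|X|^p <= u for a KNOWN p in (1, 2].
    The t-th sample is kept only if |x_t| <= (u * t / log(1/delta))^(1/p);
    otherwise it contributes zero to the sum.
    """
    if not 1.0 < p <= 2.0:
        raise ValueError("p must lie in (1, 2]")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    n = len(samples)
    if n == 0:
        raise ValueError("samples must be non-empty")

    log_term = math.log(1.0 / delta)
    total = 0.0
    for t, x in enumerate(samples, start=1):
        # The threshold grows with t, so fewer samples are discarded later on.
        threshold = (u * t / log_term) ** (1.0 / p)
        if abs(x) <= threshold:
            total += x
    # Divide by the full sample count, including truncated samples.
    return total / n


# One extreme outlier is zeroed out instead of dominating the average.
estimate = truncated_mean([1.0, 1.0, 1000.0, 1.0], p=2.0, u=4.0, delta=0.1)
print(estimate)
```

The truncation threshold depends explicitly on `p` and `u`, which is the kind of prior information that, according to the abstract, p-Heavy-UCRL and p-Heavy-Q-learning do not require.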