Heavy-Tailed Reinforcement Learning With Penalized Robust Estimator

Authors
Park, Hyeon-Jun [1]
Lee, Kyungjae [1]
Affiliations
[1] Chung Ang Univ, Dept Artificial Intelligence, Seoul 06974, South Korea
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Noise measurement; Heavily-tailed distribution; Q-learning; Stochastic processes; Random variables; Object recognition; Markov decision processes; Reinforcement learning; heavy-tailed noise; regret analysis;
DOI
10.1109/ACCESS.2024.3424828
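Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract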
We consider finite-horizon episodic reinforcement learning (RL) under heavy-tailed noise, where the p-th moment is bounded for some p ∈ (1, 2]. In this setting, existing RL algorithms are limited by their requirement for prior knowledge of the bounded moment order of the noise distribution. This requirement hinders their practical application, as such prior information is rarely available in real-world scenarios. Our proposed method eliminates the need for this prior knowledge, enabling implementation in a wider range of scenarios. We introduce two RL algorithms, p-Heavy-UCRL and p-Heavy-Q-learning, designed for model-based and model-free RL settings, respectively. Without the need for prior knowledge, these algorithms demonstrate robustness to heavy-tailed noise and achieve nearly optimal regret bounds, up to logarithmic terms, with the same dependencies on dominating terms as existing algorithms. Finally, we show that our proposed algorithms have empirically comparable performance to existing algorithms in a synthetic tabular scenario.
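For context on the "prior knowledge" the abstract refers to, the following is a minimal sketch of a classical truncated empirical mean, a standard robust estimator from the heavy-tailed bandit literature. It is not the penalized robust estimator of this paper, which this record does not describe. The function names and the parameters `p`, `u`, and `delta` are illustrative assumptions: `p` is the moment order, `u` an assumed bound on the p-th absolute moment, and `delta` a failure probability. The point is that the truncation threshold depends explicitly on `p`, so the estimator cannot be run without knowing it.

```python
import math
import random


def truncated_mean(samples, p, u, delta):
    """Truncated empirical mean (illustrative; not this paper's estimator).

    Assumes E|X|^p <= u for a known p in (1, 2]. The i-th sample is kept
    only if its magnitude is at most (u * i / log(1/delta)) ** (1/p),
    so the threshold cannot be computed without knowing p.
    """
    log_term = math.log(1.0 / delta)
    total = 0.0
    for i, x in enumerate(samples, start=1):
        threshold = (u * i / log_term) ** (1.0 / p)
        if abs(x) <= threshold:
            total += x
    return total / len(samples)


def symmetric_pareto_noise(rng, alpha, n):
    """Zero-mean heavy-tailed noise (hypothetical helper for experiments).

    Magnitudes are Pareto(alpha) with a random sign; the p-th moment is
    finite only for p < alpha.
    """
    return [rng.choice((-1.0, 1.0)) * rng.paretovariate(alpha) for _ in range(n)]
```

A quick experiment could draw `symmetric_pareto_noise(random.Random(0), 1.5, 10_000)` and compare the plain sample mean with `truncated_mean` at different values of `p`, which shows how sensitive the estimate is to that choice.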
Pages: 107800 - 107817
Number of pages: 18
Related papers
50 in total
  • [1] Robust Offline Reinforcement Learning with Heavy-Tailed Rewards
    Zhu, Jin
    Wan, Runzhe
    Qi, Zhengling
    Luo, Shikai
    Shi, Chengchun
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [2] No-Regret Reinforcement Learning with Heavy-Tailed Rewards
    Zhuang, Vincent
    Sui, Yanan
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [3] Robust estimator of the ruin probability in infinite time for heavy-tailed distributions
    Deme, El Hadji
    Slaoui, Yousri
    Kebe, Modou
    Manou-Abi, Solym
    STATISTICS, 2024, 58 (06) : 1401 - 1422
  • [4] Provably Robust Temporal Difference Learning for Heavy-Tailed Rewards
    Cayci, Semih
    Eryilmaz, Atilla
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [5] A parametric alternative to the Hill estimator for heavy-tailed distributions
    Kim, Joseph H. T.
    Kim, Joocheol
    JOURNAL OF BANKING & FINANCE, 2015, 54 : 60 - 71
  • [6] A new extreme quantile estimator for heavy-tailed distributions
    Fils, A
    Guillou, A
    COMPTES RENDUS MATHEMATIQUE, 2004, 338 (06) : 493 - 498
  • [7] Heavy-Tailed Model for Visual Tracking via Robust Subspace Learning
    Wang, Daojing
    Zhang, Chao
    Hao, Pengwei
    COMPUTER VISION - ACCV 2009, PT II, 2010, 5995 : 172 - 181
  • [8] Robust Heavy-Tailed Linear Bandits Algorithm
    Ma L.
    Zhao P.
    Zhou Z.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2023, 60 (06): : 1385 - 1395
  • [9] Robust Nonparametric Regression for Heavy-Tailed Data
    Gorji, Ferdos
    Aminghafari, Mina
    JOURNAL OF AGRICULTURAL BIOLOGICAL AND ENVIRONMENTAL STATISTICS, 2020, 25 (03) : 277 - 291
  • [10] Robust Matrix Completion with Heavy-Tailed Noise
    Wang, Bingyan
    Fan, Jianqing
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2024,