Heavy-Tailed Reinforcement Learning With Penalized Robust Estimator

Cited by: 0

Authors:

Park, Hyeon-Jun [1]

Lee, Kyungjae [1]

Affiliations:

[1] Chung Ang Univ, Dept Artificial Intelligence, Seoul 06974, South Korea

Source:

IEEE Access | 2024 / Vol. 12

Keywords:

Noise measurement; Heavily-tailed distribution; Q-learning; Stochastic processes; Random variables; Object recognition; Markov decision processes; Reinforcement learning; heavy-tailed noise; regret analysis;

DOI:

10.1109/ACCESS.2024.3424828

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

We consider finite-horizon episodic reinforcement learning (RL) under heavy-tailed noise, where the p-th moment is bounded for any p ∈ (1,2]. In this setting, existing RL algorithms are limited by their requirement for prior knowledge about the bounded moment order of the noise distribution. This requirement hinders their practical application, as such prior information is rarely available in real-world scenarios. Our proposed method eliminates the need for this prior knowledge, enabling implementation in a wider range of scenarios. We introduce two RL algorithms, p-Heavy-UCRL and p-Heavy-Q-learning, designed for model-based and model-free RL settings, respectively. Without the need for prior knowledge, these algorithms demonstrate robustness to heavy-tailed noise and achieve nearly optimal regret bounds, up to logarithmic terms, with the same dependencies on dominating terms as existing algorithms. Finally, we show that our proposed algorithms have empirically comparable performance to existing algorithms in a synthetic tabular scenario.


Pages: 107800-107817

Number of pages: 18
