On Convergence Rate of MRetrace

Cited by: 0

Authors:

Chen, Xingguo [1]

Qin, Wangrong [1]

Gong, Yu [1]

Yang, Shangdong [1]

Wang, Wenhao [2,3]

Affiliations:

[1] Nanjing Univ Posts & Telecommun, Jiangsu Key Lab Big Data Secur & Intelligent Proc, Nanjing 210023, Peoples R China

[2] Natl Univ Def Technol, Coll Elect Engn, Changsha 410073, Peoples R China

[3] Natl Univ Def Technol, Sci & Technol Informat Syst Engn Lab, Changsha 410073, Peoples R China

Source:

MATHEMATICS | 2024 / Vol. 12 / Issue 18

Keywords:

finite sample analysis; off-policy learning; minimum eigenvalues; MRetrace

DOI:

10.3390/math12182930

Chinese Library Classification (CLC) number:

O1 [Mathematics]

Discipline classification code:

0701; 070101

Abstract:

Off-policy is a key setting for reinforcement learning algorithms. In recent years, the stability of off-policy learning for value-based reinforcement learning has been guaranteed even when combined with linear function approximation and bootstrapping. Convergence rate analysis is currently a hot topic. However, the convergence rates of learning algorithms vary, and analyzing the reasons behind this remains an open problem. In this paper, we propose an essentially simplified version of a convergence rate to generate general off-policy temporal difference learning algorithms. We emphasize that the primary determinant influencing convergence rate is the minimum eigenvalue of the key matrix. Furthermore, we conduct a comparative analysis of the influencing factor across various off-policy learning algorithms in diverse numerical scenarios. The experimental findings validate the proposed determinant, which serves as a benchmark for the design of more efficient learning algorithms.


Pages: 19

50 items in total

[21] The rate of convergence for quadratic forms
Basalykas A.
Lithuanian Mathematical Journal, 1997, 37 (3) : 191 - 206
[22] THE RATE OF CONVERGENCE OF CONJUGATE GRADIENTS
VANDERSLUIS, A
VANDERVORST, HA
NUMERISCHE MATHEMATIK, 1986, 48 (05) : 543 - 560
[23] On the rate of convergence of iterated exponentials
Fuchang Gao
Lixing Han
Kenneth Schilling
Journal of Applied Mathematics and Computing, 2012, 39 (1-2) : 89 - 96
[24] On the Rate of Convergence of STSD Extremes
Lin, Fuming
Zhang, Xinhua
Peng, Zuoxiang
Jiang, Yingying
COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2011, 40 (10) : 1795 - 1806
[25] Rate of Convergence to Limit Distribution
Boyarinov, R. N.
MOSCOW UNIVERSITY MATHEMATICS BULLETIN, 2011, 66 (02) : 70 - 76
[26] RATE OF CONVERGENCE FOR INVARIANCE PRINCIPLE
BOROVKOV, AA
TEORIYA VEROYATNOSTEI I YEYE PRIMENIYA, 1973, 18 (02) : 217 - 234
[27] On the convergence rate of the unscented transformation
Ahn, Kwang Woo
Chan, Kung-Sik
ANNALS OF THE INSTITUTE OF STATISTICAL MATHEMATICS, 2013, 65 (05) : 889 - 912
[28] Convergence rate for consensus with delays
Angelia Nedić
Asuman Ozdaglar
Journal of Global Optimization, 2010, 47 : 437 - 456
[29] The Convergence Rate of the MDM Algorithm
Lopez, Jorge
Dorronsoro, Jose R.
2012 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2012,
[30] Convergence rate of McCormick relaxations
Bompadre, Agustin
Mitsos, Alexander
JOURNAL OF GLOBAL OPTIMIZATION, 2012, 52 (01) : 1 - 28
