Variance Reduction for Evolutionary Strategies via Structured Control Variates

Cited by: 0

Authors:

Tang, Yunhao [1]

Choromanski, Krzysztof [2]

Kucukelbir, Alp [1,3]

Affiliations:

[1] Columbia Univ, New York, NY 10027 USA

[2] Google Robot, Mountain View, CA USA

[3] Fero Labs, New York, NY USA

Source:

INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108 | 2020 / Vol. 108

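Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;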

Abstract:

Evolution strategies (ES) are a powerful class of blackbox optimization techniques that have recently become a competitive alternative to state-of-the-art policy gradient (PG) algorithms for reinforcement learning (RL). We propose a new method for improving the accuracy of ES algorithms. Unlike recent approaches that exploit only the Monte Carlo structure of the gradient estimator, it takes advantage of the underlying Markov decision process (MDP) structure to reduce variance. We observe that the gradient estimator of the ES objective can alternatively be computed using reparametrization and PG estimators, which leads to new control variate techniques for gradient estimation in ES optimization. We provide theoretical insights and show through extensive experiments that this RL-specific variance reduction approach outperforms general-purpose variance reduction methods.
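As a rough illustration of what a control variate does for an ES gradient estimator, the sketch below compares the plain Gaussian-smoothing estimator with one that subtracts a constant baseline. This is the generic, general-purpose kind of variance reduction, not the structured MDP-based control variate the abstract describes. The objective `quadratic`, the helper `es_gradient_samples`, and all parameter values are hypothetical choices made for this example.

```python
import numpy as np


def es_gradient_samples(f, theta, sigma, eps, baseline=0.0):
    """Per-sample ES gradient estimates (f(theta + sigma*eps_i) - baseline) * eps_i / sigma.

    Because E[eps] = 0, subtracting a constant baseline leaves the mean
    unchanged; it acts as a simple control variate.
    """
    values = np.array([f(theta + sigma * e) for e in eps])
    return (values - baseline)[:, None] * eps / sigma


def quadratic(x):
    # Hypothetical toy objective (assumption): maximized at x = 1.
    return -np.sum((x - 1.0) ** 2)


rng = np.random.default_rng(0)
theta = np.zeros(5)
sigma = 0.1
eps = rng.standard_normal((2000, theta.size))

# Plain estimator versus the same estimator with baseline f(theta).
plain = es_gradient_samples(quadratic, theta, sigma, eps)
with_cv = es_gradient_samples(
    quadratic, theta, sigma, eps, baseline=quadratic(theta)
)

# For this quadratic, the gradient of the smoothed objective is -2 * (theta - 1).
true_grad = -2.0 * (theta - 1.0)

# Total per-sample variance, summed over coordinates.
var_plain = plain.var(axis=0).sum()
var_cv = with_cv.var(axis=0).sum()

print("true gradient:      ", true_grad)
print("estimate (baseline):", with_cv.mean(axis=0))
print("variance, plain:    ", var_plain)
print("variance, baseline: ", var_cv)
```

Both estimators target the same gradient, but the baseline removes the large constant term of the objective from every sample, so the per-sample variance drops sharply. The abstract's claim is that control variates built from the MDP structure do better than general-purpose corrections of this kind.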

Pages: 10
