Stochastic Variance Reduction Methods for Policy Evaluation

Cited by: 0
Authors
Du, Simon S. [1]
Chen, Jianshu [2]
Li, Lihong [2]
Xiao, Lin [2]
Zhou, Dengyong [2]
Affiliations
[1] Carnegie Mellon Univ, Machine Learning Dept, Pittsburgh, PA 15213 USA
[2] Microsoft Res, Redmond, WA 98052 USA
Keywords
ALGORITHMS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Policy evaluation is concerned with estimating the value function that predicts long-term values of states under a given policy. It is a crucial step in many reinforcement-learning algorithms. In this paper, we focus on policy evaluation with linear function approximation over a fixed dataset. We first transform the empirical policy evaluation problem into a (quadratic) convex-concave saddle-point problem, and then present a primal-dual batch gradient method, as well as two stochastic variance reduction methods for solving the problem. These algorithms scale linearly in both sample size and feature dimension. Moreover, they achieve linear convergence even when the saddle-point problem has only strong concavity in the dual variables but no strong convexity in the primal variables. Numerical experiments on benchmark problems demonstrate the effectiveness of our methods.
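As a rough illustration of the saddle-point reformulation and the primal-dual batch gradient method mentioned in the abstract, the following is a minimal sketch. The abstract does not state the objective, so everything here is an assumption: it takes the empirical objective to be the mean squared projected Bellman error, 1/2 (A theta - b)^T C^{-1} (A theta - b), rewritten as min over theta, max over w of L(theta, w) = w^T (b - A theta) - 1/2 w^T C w. The function names, step sizes, and the absence of regularization are all hypothetical choices, and the variance-reduced variants are not shown.

```python
import numpy as np

def empirical_stats(phi, phi_next, rewards, gamma):
    """Build A, b, C from a fixed dataset of transitions (assumed definitions).

    phi, phi_next: (n, d) feature matrices for the current and next states.
    rewards: (n,) vector of rewards. gamma: discount factor.
    """
    n = phi.shape[0]
    A = phi.T @ (phi - gamma * phi_next) / n
    b = phi.T @ rewards / n
    C = phi.T @ phi / n
    return A, b, C

def primal_dual_batch_gradient(A, b, C, step_theta, step_w, num_iters):
    """Gradient descent in theta and gradient ascent in w on
    L(theta, w) = w^T (b - A theta) - 0.5 * w^T C w (assumed form)."""
    d = A.shape[1]
    theta = np.zeros(d)
    w = np.zeros(A.shape[0])
    for _ in range(num_iters):
        grad_theta = -A.T @ w            # dL/dtheta
        grad_w = b - A @ theta - C @ w   # dL/dw
        # Simultaneous update: both gradients use the previous iterate.
        theta = theta - step_theta * grad_theta
        w = w + step_w * grad_w
    return theta, w
```

In this sketch, L is strongly concave in w whenever C is positive definite, but it is only linear in theta, which matches the setting the abstract describes (strong concavity in the dual variables, no strong convexity in the primal variables). Each iteration touches the full dataset only through A, b, and C; a stochastic variant would replace them with per-sample estimates.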
Pages: 10
Related papers
50 in total
  • [1] Some variance reduction methods for numerical stochastic homogenization
    Blanc, X.
    Le Bris, C.
    Legoll, F.
    [J]. PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES, 2016, 374 (2066):
  • [2] Stochastic EM methods with variance reduction for penalised PET reconstructions
    Kereta, Zeljko
    Twyman, Robert
    Arridge, Simon
    Thielemans, Kris
    Jin, Bangti
    [J]. INVERSE PROBLEMS, 2021, 37 (11)
  • [3] Faster Stochastic Variance Reduction Methods for Compositional MiniMax Optimization
    Liu, Jin
    Pan, Xiaokang
    Duan, Junwen
    Li, Hong-Dong
    Li, Youqi
    Qu, Zhe
    [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 12, 2024, : 13927 - 13935
  • [4] Stochastic Variance Reduction Methods for Saddle-Point Problems
    Balamurugan, P.
    Bach, Francis
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [5] Reliability evaluation of the stochastic network using Variance Reduction Techniques
    Kim, Won Kyung
    [J]. PROCEEDINGS OF THE FIFTH INTERNATIONAL CONFERENCE ON INFORMATION AND MANAGEMENT SCIENCES, 2006, 5 : 348 - 352
  • [6] Fast Stochastic Bregman Gradient Methods: Sharp Analysis and Variance Reduction
    Dragomir, Radu-Alexandru
    Even, Mathieu
    Hendrikx, Hadrien
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [7] Stochastic Gradient Hamiltonian Monte Carlo Methods with Recursive Variance Reduction
    Zou, Difan
    Xu, Pan
    Gu, Quanquan
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [8] SAAGs: Biased stochastic variance reduction methods for large-scale learning
    Vinod Kumar Chauhan
    Anuj Sharma
    Kalpana Dahiya
    [J]. Applied Intelligence, 2019, 49 : 3331 - 3361
  • [9] Stochastic quasi-gradient methods: variance reduction via Jacobian sketching
    Robert M. Gower
    Peter Richtárik
    Francis Bach
    [J]. Mathematical Programming, 2021, 188 : 135 - 192
  • [10] Stochastic quasi-gradient methods: variance reduction via Jacobian sketching
    Gower, Robert M.
    Richtarik, Peter
    Bach, Francis
    [J]. MATHEMATICAL PROGRAMMING, 2021, 188 (01) : 135 - 192