Stationary Behavior of Constant Stepsize SGD Type Algorithms: An Asymptotic Characterization

Cited by: 0
Authors
Chen Z. [1]
Mou S. [1]
Maguluri S.T. [1]
Affiliations
[1] Georgia Institute of Technology, Atlanta, GA
Source
Performance Evaluation Review | 2022 / Vol. 50 / Issue 01
Funding
U.S. National Science Foundation
Keywords
asymptotic analysis; stationary distribution; stochastic approximation; stochastic gradient descent
DOI
10.1145/3547353.3522659
Abstract
Stochastic approximation (SA) and stochastic gradient descent (SGD) algorithms are workhorses of modern machine learning. Their constant-stepsize variants are preferred in practice due to their fast convergence. However, constant-stepsize SA algorithms do not converge to the optimal solution; instead, the iterates have a stationary distribution, which in general cannot be characterized analytically. In this work, we study the asymptotic behavior of the appropriately scaled stationary distribution in the limit as the constant stepsize goes to zero. Specifically, we consider the following three settings: (1) an SGD algorithm with a smooth and strongly convex objective, (2) a linear SA algorithm involving a Hurwitz matrix, and (3) a nonlinear SA algorithm involving a contractive operator. When the iterate is scaled by 1/√α, where α is the constant stepsize, we show that the limiting scaled stationary distribution is a solution of an implicit equation. Under a uniqueness assumption on this equation (which can be removed in certain settings), we further characterize the limiting distribution as a Gaussian whose covariance matrix is the unique solution of an appropriate Lyapunov equation. For SA algorithms beyond these cases, our numerical experiments suggest that, unlike in central limit theorem type results, (1) the scaling factor need not be 1/√α, and (2) the limiting distribution need not be Gaussian. Based on the numerical study, we propose a heuristic formula for the correct scaling factor, and make a connection to the Euler-Maruyama discretization scheme for approximating stochastic differential equations. © 2022 Copyright held by the owner/author(s).
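The linear setting in the abstract admits a short numerical check. The sketch below is not from the paper; all matrices, stepsizes, and iteration counts are illustrative assumptions. It runs constant-stepsize SGD on the quadratic objective f(x) = ½ xᵀAx with additive Gaussian gradient noise, scales the (approximately) stationary iterates by 1/√α, and compares their empirical covariance with the solution Σ of the Lyapunov equation AΣ + ΣAᵀ = Γ predicted by the Gaussian limit.

```python
# Illustrative sketch (not the authors' code): constant-stepsize SGD on the
# quadratic f(x) = 0.5 * x^T A x with additive Gaussian gradient noise,
#     x_{k+1} = x_k - alpha * (A x_k + w_k),   w_k ~ N(0, Gamma).
# The Gaussian limit described in the abstract predicts that, as alpha -> 0,
# the stationary distribution of x / sqrt(alpha) approaches N(0, Sigma) with
# A Sigma + Sigma A^T = Gamma.  All parameter values below are assumptions.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(0)
d, alpha = 2, 1e-3
n_iters, burn_in = 500_000, 50_000

A = np.array([[2.0, 0.5],
              [0.5, 1.0]])  # positive definite, so the ODE dx/dt = -A x is stable
Gamma = np.eye(d)           # covariance of the gradient noise w_k

x = np.zeros(d)
samples = np.empty((n_iters - burn_in, d))
noise = rng.multivariate_normal(np.zeros(d), Gamma, size=n_iters)
for k in range(n_iters):
    x = x - alpha * (A @ x + noise[k])  # noisy gradient step
    if k >= burn_in:
        samples[k - burn_in] = x        # collect (approximately) stationary iterates

# Empirical covariance of the scaled iterate x / sqrt(alpha) ...
empirical = np.cov(samples, rowvar=False) / alpha
# ... against the Lyapunov prediction: Sigma solves A Sigma + Sigma A^T = Gamma.
Sigma = solve_continuous_lyapunov(A, Gamma)
print("empirical covariance of x/sqrt(alpha):\n", empirical)
print("Lyapunov solution Sigma:\n", Sigma)
```

With these assumed parameters the two matrices should agree to within sampling error (a few percent here); shrinking α further tightens the match, consistent with the α → 0 limit described in the abstract.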
Pages: 109-110
Number of pages: 2
Related Papers
50 records in total
  • [1] Stationary Behavior of Constant Stepsize SGD Type Algorithms: An Asymptotic Characterization
    Chen, Zaiwei
    Mou, Shancong
    Maguluri, Siva Theja
    PROCEEDINGS OF THE ACM ON MEASUREMENT AND ANALYSIS OF COMPUTING SYSTEMS, 2022, 6 (01)
  • [2] Benign Overfitting of Constant-Stepsize SGD for Linear Regression
    Zou, Difan
    Wu, Jingfeng
    Braverman, Vladimir
    Gu, Quanquan
    Kakade, Sham M.
    JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [3] On the asymptotic behavior of a constant stepsize temporal-difference learning algorithm
Tadic, V
    COMPUTATIONAL LEARNING THEORY, 1999, 1572 : 126 - 137
  • [4] The asymptotic behaviour of the θ-methods with constant stepsize for the generalized pantograph equation
    Zhang, Gengen
    Xiao, Aiguo
    Wang, Wansheng
    INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS, 2016, 93 (09) : 1484 - 1504
  • [5] Convergence behavior of the constant modulus algorithm controlled by special stepsize
    Guo, L
    Li, N
    Guo, Y
    Zhou, JP
    2002 6TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I AND II, 2002, : 390 - 392
  • [6] On the asymptotic behavior of some algorithms
    Robert, P
    RANDOM STRUCTURES & ALGORITHMS, 2005, 27 (02) : 235 - 250
  • [7] ASYMPTOTIC BEHAVIOR OF PREDICTION ERROR OF A STATIONARY SEQUENCE WITH A SPECTRAL DENSITY OF SPECIAL TYPE
    IBRAGIMOV, IA
    SOLEV, VN
THEORY OF PROBABILITY AND ITS APPLICATIONS, 1968, 13 (04) : 703+
  • [8] Non-asymptotic error bounds for constant stepsize stochastic approximation for tracking mobile agents
    Kumar, Bhumesh
    Borkar, Vivek
    Shetty, Akhil
    MATHEMATICS OF CONTROL SIGNALS AND SYSTEMS, 2019, 31 (04) : 589 - 614
  • [9] An Analysis of Constant Step Size SGD in the Non-convex Regime: Asymptotic Normality and Bias
    Yu, Lu
    Balasubramanian, Krishnakumar
    Volgushev, Stanislav
    Erdogdu, Murat A.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34