CONTROLLING STOCHASTIC GRADIENT DESCENT USING STOCHASTIC APPROXIMATION FOR ROBUST DISTRIBUTED OPTIMIZATION

Cited by: 0
Authors
Jain, Adit [1]
Krishnamurthy, Vikram [1]
Affiliations
[1] Cornell Univ, Ithaca, NY 14850 USA
Funding
US National Science Foundation
Keywords
Stochastic approximation; distributed optimization; Markov decision processes; policies
DOI
10.3934/naco.2024041
CLC Classification
O29 [Applied Mathematics]
Subject Classification Code
070104
Abstract
This paper deals with the problem of controlling stochastic gradient descent performed by multiple learners, where each learner aims to estimate the minimizer arg min f of its respective objective using noisy gradients obtained by querying a stochastic oracle. Each query incurs a learning cost, and the noise variance of the gradient response has a bound that evolves in a Markovian fashion. For a single learner, the decision problem is when to query the oracle so that the learning cost is minimized; it is formulated as a constrained Markov decision process (CMDP). Structural results are proven for the optimal policy of the CMDP, which is shown to be threshold decreasing in the queue state. For multiple learners, a constrained switching control game is formulated for scheduling and controlling N learners that query the same oracle, one at a time, and the structural results are extended to the optimal policy achieving the Nash equilibrium. These structural results motivate a stochastic approximation algorithm that searches for the optimal policy by tracking its parameters through a sigmoidal approximation of the threshold, without requiring knowledge of the underlying transition probabilities. The paper also briefly discusses applications in federated learning and numerically demonstrates the convergence properties of the proposed algorithm.
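The abstract compresses several moving parts: a threshold query policy in the queue state, Markov-modulated gradient noise, and a stochastic approximation (SA) recursion that tracks the threshold through a sigmoidal approximation. The minimal Python sketch below illustrates that control loop under stated assumptions; everything in it is illustrative rather than the authors' implementation: the quadratic objective, the queue and noise-chain dynamics, the cost constants, and the score-function (likelihood-ratio) gradient estimator used for the SA update, which is one standard simulation-based choice and may differ from the paper's exact recursion and CMDP formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
BETA = 4.0  # sharpness of the sigmoidal threshold approximation (assumed)

def noisy_grad(theta, noise_std):
    """Stochastic gradient oracle for f(theta) = 0.5*||theta||^2 (illustrative)."""
    return theta + noise_std * rng.normal(size=theta.shape)

def query_prob(queue, tau):
    """Sigmoidal approximation of a hard threshold policy in the queue state:
    the query probability rises toward 1 as the backlog exceeds tau."""
    return 1.0 / (1.0 + np.exp(-BETA * (queue - tau)))

theta = np.ones(2)            # learner's SGD iterate
tau = 5.0                     # threshold parameter tracked by the SA recursion
queue = 0                     # backlog of pending learning jobs (assumed model)
mu, eps = 0.1, 0.01           # SGD step size and SA step size (assumed)
C_QUERY, C_HOLD = 0.5, 0.05   # per-query cost and per-job holding cost (assumed)
noise_state = 0               # state of the Markov chain modulating noise
P = np.array([[0.9, 0.1],     # assumed transition matrix of the noise chain
              [0.2, 0.8]])
STD = (0.1, 1.0)              # noise-variance bound (std) in each chain state

for t in range(5000):
    # Markov-modulated noise bound on the oracle's gradient response.
    noise_state = rng.choice(2, p=P[noise_state])
    noise_std = STD[noise_state]
    p = query_prob(queue, tau)
    act = rng.random() < p                 # randomized threshold policy
    if act:
        theta = theta - mu * noisy_grad(theta, noise_std)
        queue = max(queue - 1, 0)          # a query serves one pending job
    if rng.random() < 0.4:                 # new learning job arrives (assumed)
        queue = min(queue + 1, 10)
    cost = act * (C_QUERY + noise_std**2) + C_HOLD * queue
    # Score-function (likelihood-ratio) SA update of the threshold; for the
    # sigmoid above, d/d tau of log P(act) equals -BETA * (act - p).
    tau -= eps * cost * (-BETA * (float(act) - p))

print("iterate:", theta, "learned threshold:", round(tau, 2))
```

The point of the sigmoidal smoothing is that it makes the query probability differentiable in the threshold, so a simulation-based SA update can move tau without knowledge of the underlying transition probabilities; as BETA grows, the randomized policy approaches the hard threshold policy whose structure the paper establishes.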
Pages: 23