CONTROLLING STOCHASTIC GRADIENT DESCENT USING STOCHASTIC APPROXIMATION FOR ROBUST DISTRIBUTED OPTIMIZATION

Cited by: 0
Authors
Jain, Adit [1]
Krishnamurthy, Vikram [1]
Affiliations
[1] Cornell Univ, Ithaca, NY 14850 USA
Funding
U.S. National Science Foundation;
Keywords
Stochastic approximation; distributed optimization; Markov decision processes; POLICIES;
DOI
10.3934/naco.2024041
Chinese Library Classification number
O29 [Applied Mathematics];
Discipline classification code
070104;
Abstract
This paper deals with the problem of controlling the stochastic gradient descent, performed by multiple learners where the aim is to estimate the respective arg min f using noisy gradients obtained by querying a stochastic oracle. Each query has a learning cost, and the noisy gradient response has varying degrees of noise variance, the bound of which is assumed to vary in a Markovian fashion. For a single learner, the decision problem is to choose when to query the oracle such that the learning cost is minimized. A constrained Markov decision process (CMDP) is formulated to solve the decision problem of a single learner. Structural results are proven for the optimal policy for the CMDP, which is shown to be threshold decreasing in the queue state. For multiple learners, a constrained switching control game is formulated for scheduling and controlling N learners querying the same oracle, one at a time. The structural results are extended for the optimal policy achieving the Nash equilibrium. The structural results are used to propose a stochastic approximation algorithm to search for the optimal policy, which tracks the parameters of the optimal policy using a sigmoidal approximation and does not require knowledge of the underlying transition probabilities. The paper also briefly discusses applications in federated learning and numerically shows the convergence properties of the proposed algorithm.
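The idea in the abstract of replacing a hard threshold policy with a sigmoidal approximation, and then tuning the threshold by stochastic approximation using only noisy cost evaluations, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy cost `simulated_cost`, the target threshold, the temperature `tau`, the gain sequences, and the choice of a simultaneous-perturbation (SPSA-style) update. The abstract does not say which stochastic approximation scheme the authors use.

```python
import math
import random


def sigmoid_policy(state, threshold, tau=0.5):
    """Smooth surrogate for the hard rule 'act iff state >= threshold'.

    Returns the probability of acting in `state`; `tau` (assumed) controls
    how sharply the sigmoid approximates the step function.
    """
    return 1.0 / (1.0 + math.exp(-(state - threshold) / tau))


def simulated_cost(threshold, rng, n_states=10, target=4.0):
    """Hypothetical noisy cost, standing in for a simulated system.

    Penalizes disagreement with a hard threshold at `target`; the searcher
    only sees noisy evaluations, never the model behind them.
    """
    cost = 0.0
    for s in range(n_states):
        ideal = 1.0 if s >= target else 0.0
        cost += (sigmoid_policy(s, threshold) - ideal) ** 2
    return cost + rng.gauss(0.0, 0.05)


def spsa_search(theta0, iters=2000, seed=0):
    """One-dimensional SPSA-style search for the threshold parameter."""
    rng = random.Random(seed)
    theta = theta0
    for k in range(1, iters + 1):
        a_k = 0.5 / k ** 0.602      # decreasing step size (assumed gains)
        c_k = 0.2 / k ** 0.101      # decreasing perturbation size
        delta = rng.choice((-1.0, 1.0))  # random perturbation direction
        y_plus = simulated_cost(theta + c_k * delta, rng)
        y_minus = simulated_cost(theta - c_k * delta, rng)
        grad_est = (y_plus - y_minus) / (2.0 * c_k * delta)
        theta -= a_k * grad_est     # move against the estimated gradient
    return theta


if __name__ == "__main__":
    print(spsa_search(7.0))
```

In this toy setup the hard threshold sits between states 3 and 4, so the estimate should settle near 3.5 without the search ever using a model of the noise or of any transition probabilities.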
Pages: 23
Related papers
50 in total
  • [1] Robust Pose Graph Optimization Using Stochastic Gradient Descent
    Wang, John
    Olson, Edwin
    2014 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2014, : 4284 - 4289
  • [2] Stochastic Gradient Descent on a Tree: an Adaptive and Robust Approach to Stochastic Convex Optimization
    Vakili, Sattar
    Salgia, Sudeep
    Zhao, Qing
    2019 57TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2019, : 432 - 438
  • [3] Distributed Stochastic Gradient Descent Using LDGM Codes
    Horii, Shunsuke
    Yoshida, Takahiro
    Kobayashi, Manabu
    Matsushima, Toshiyasu
    2019 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2019, : 1417 - 1421
  • [4] A Novel Distributed Variant of Stochastic Gradient Descent and Its Optimization
    Wang, Yi-qi
    Zhao, Ya-wei
    Shi, Zhan
    Yin, Jian-ping
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND COMPUTER SCIENCE (AICS 2016), 2016, : 486 - 492
  • [5] Bayesian Distributed Stochastic Gradient Descent
    Teng, Michael
    Wood, Frank
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [6] On the diffusion approximation of nonconvex stochastic gradient descent
    Hu, Wenqing
    Li, Chris Junchi
    Li, Lei
    Liu, Jian-Guo
    ANNALS OF MATHEMATICAL SCIENCES AND APPLICATIONS, 2019, 4 (01) : 3 - 32
  • [7] Asymptotic Network Independence in Distributed Stochastic Optimization for Machine Learning: Examining Distributed and Centralized Stochastic Gradient Descent
    Pu, Shi
    Olshevsky, Alex
    Paschalidis, Ioannis Ch.
    IEEE SIGNAL PROCESSING MAGAZINE, 2020, 37 (03) : 114 - 122
  • [8] Predicting Throughput of Distributed Stochastic Gradient Descent
    Li, Zhuojin
    Paolieri, Marco
    Golubchik, Leana
    Lin, Sung-Han
    Yan, Wumo
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (11) : 2900 - 2912
  • [9] Distributed stochastic gradient descent with discriminative aggregating
    Chen, Zhen-Hong
    Lan, Yan-Yan
    Guo, Jia-Feng
    Cheng, Xue-Qi
    Jisuanji Xuebao/Chinese Journal of Computers, 2015, 38 (10) : 2054 - 2063
  • [10] Adaptive Sampling for Incremental Optimization Using Stochastic Gradient Descent
    Papa, Guillaume
    Bianchi, Pascal
    Clemencon, Stephan
    ALGORITHMIC LEARNING THEORY, ALT 2015, 2015, 9355 : 317 - 331