Particle Dual Averaging: Optimization of Mean Field Neural Network with Global Convergence Rate Analysis

Citations: 0
Authors
Nitanda, Atsushi [1 ,2 ]
Wu, Denny [3 ,4 ]
Suzuki, Taiji [2 ,5 ]
Affiliations
[1] Kyushu Inst Technol, Kitakyushu, Fukuoka, Japan
[2] RIKEN Ctr Adv Intelligence Project, Tokyo, Japan
[3] Univ Toronto, Toronto, ON, Canada
[4] Vector Inst Artificial Intelligence, Toronto, ON, Canada
[5] Univ Tokyo, Tokyo, Japan
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
LOGARITHMIC SOBOLEV INEQUALITIES;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose the particle dual averaging (PDA) method, which generalizes the dual averaging method in convex optimization to optimization over probability distributions with a quantitative runtime guarantee. The algorithm consists of an inner loop and an outer loop: the inner loop uses the Langevin algorithm to approximately solve for a stationary distribution, which is then optimized in the outer loop. The method can thus be interpreted as an extension of the Langevin algorithm that naturally handles nonlinear functionals on the probability space. An important application of the proposed method is the optimization of neural networks in the mean field regime, which is theoretically attractive due to the presence of nonlinear feature learning, but for which quantitative convergence rates can be challenging to obtain. By adapting finite-dimensional convex optimization theory to the space of measures, we analyze PDA in regularized empirical / expected risk minimization, and establish quantitative global convergence in learning two-layer mean field neural networks under more general settings. Our theoretical results are supported by numerical simulations on neural networks of reasonable size.
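The inner/outer loop structure described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes a full-batch squared loss, tanh neurons, an outer-loop average of residuals weighted by the iteration index, and a fixed number of Langevin steps per outer iteration. The names `pda_sketch`, `network_output`, `potential_grad`, `lam1` (L2 penalty) and `lam2` (entropy weight, used here as the Langevin temperature) are hypothetical and chosen only for illustration.

```python
import numpy as np

def network_output(theta, X):
    # Mean field two-layer network: average of tanh neurons over all particles.
    return np.tanh(X @ theta.T).mean(axis=1)

def potential_grad(theta, X, rbar, lam1):
    # Gradient, per particle, of the averaged potential
    #   G(theta) = (1/n) * sum_i rbar_i * tanh(theta . x_i) + lam1 * ||theta||^2
    pre = theta @ X.T                          # shape (M, n)
    act_grad = 1.0 - np.tanh(pre) ** 2         # derivative of tanh, shape (M, n)
    grad = (act_grad * rbar) @ X / X.shape[0]  # shape (M, d)
    return grad + 2.0 * lam1 * theta

def pda_sketch(X, y, n_particles=200, outer_steps=30, inner_steps=50,
               step=0.05, lam1=1e-3, lam2=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal((n_particles, X.shape[1]))
    rbar = np.zeros(X.shape[0])   # weighted running average of residuals
    weight_sum = 0.0
    losses = []
    for t in range(1, outer_steps + 1):
        # Outer loop: linearize the squared loss at the current particle
        # distribution and fold the residual into the weighted average
        # (weight t is an assumption of this sketch).
        residual = network_output(theta, X) - y
        losses.append(0.5 * float(np.mean(residual ** 2)))
        weight_sum += t
        rbar = rbar + (t / weight_sum) * (residual - rbar)
        # Inner loop: Langevin steps targeting the Gibbs distribution
        # proportional to exp(-G(theta) / lam2), warm-started from the
        # current particles and truncated after inner_steps iterations.
        for _ in range(inner_steps):
            noise = rng.standard_normal(theta.shape)
            theta = (theta - step * potential_grad(theta, X, rbar, lam1)
                     + np.sqrt(2.0 * step * lam2) * noise)
    return theta, losses

# Usage on synthetic data generated by a single tanh "teacher" neuron.
data_rng = np.random.default_rng(1)
X = data_rng.standard_normal((100, 3))
y = np.tanh(X @ np.array([1.0, -1.0, 0.5]))
theta, losses = pda_sketch(X, y)
print("initial loss:", losses[0], "final recorded loss:", losses[-1])
```

Because the squared loss is linear in the residual once linearized, the sketch only needs to store the averaged residual `rbar` rather than the whole history of potentials. The step size, particle count, and loop lengths are arbitrary illustrative values, not settings taken from the paper.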
Pages: 14