Particle Dual Averaging: Optimization of Mean Field Neural Network with Global Convergence Rate Analysis

Citations: 0
Authors
Nitanda, Atsushi [1 ,2 ]
Wu, Denny [3 ,4 ]
Suzuki, Taiji [2 ,5 ]
Affiliations
[1] Kyushu Inst Technol, Kitakyushu, Fukuoka, Japan
[2] RIKEN Ctr Adv Intelligence Project, Tokyo, Japan
[3] Univ Toronto, Toronto, ON, Canada
[4] Vector Inst Artificial Intelligence, Toronto, ON, Canada
[5] Univ Tokyo, Tokyo, Japan
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
LOGARITHMIC SOBOLEV INEQUALITIES;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose the particle dual averaging (PDA) method, which generalizes the dual averaging method in convex optimization to optimization over probability distributions with a quantitative runtime guarantee. The algorithm consists of an inner loop and an outer loop: the inner loop uses the Langevin algorithm to approximately solve for a stationary distribution, which is then optimized in the outer loop. The method can thus be interpreted as an extension of the Langevin algorithm that naturally handles nonlinear functionals on the probability space. An important application of the proposed method is the optimization of neural networks in the mean field regime, which is theoretically attractive due to the presence of nonlinear feature learning, but for which quantitative convergence rates can be challenging to obtain. By adapting finite-dimensional convex optimization theory to the space of measures, we analyze PDA in regularized empirical / expected risk minimization, and establish quantitative global convergence in learning two-layer mean field neural networks under more general settings. Our theoretical results are supported by numerical simulations on neural networks of reasonable size.
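The two-loop structure described in the abstract can be illustrated with a rough sketch. This is not the authors' algorithm as published: the tanh neuron, the squared loss, the full-batch residuals, the linear averaging weights, and every name and default value below (`pda_sketch`, `lam1`, `lam2`, `step_size`, and so on) are assumptions made only for illustration. The outer loop keeps a weighted average of the loss derivative, and the inner loop runs Langevin steps on the particles so that they approximately sample a distribution proportional to exp(-potential / lam2).

```python
import numpy as np


def pda_sketch(X, y, n_particles=50, outer_steps=20, inner_steps=50,
               step_size=0.05, lam1=1e-2, lam2=1e-2, seed=0):
    """Illustrative two-loop particle scheme (hypothetical, simplified).

    Each particle theta_j is one neuron; the network output is the
    particle average of tanh(theta_j . x).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = rng.standard_normal((n_particles, d))

    weighted_resid = np.zeros(n)  # running sum of t * (loss derivative)
    weight_sum = 0.0

    for t in range(1, outer_steps + 1):
        # Outer loop: derivative of 0.5 * (pred - y)^2 at the current particles.
        pred = np.tanh(X @ theta.T).mean(axis=1)
        resid = pred - y

        # Dual averaging: later iterates get larger weight (assumed weights).
        weighted_resid += t * resid
        weight_sum += t
        avg_resid = weighted_resid / weight_sum

        # Inner loop: Langevin steps on the averaged potential
        #   g(theta) = mean_i avg_resid_i * tanh(theta . x_i) + lam1 * |theta|^2,
        # with noise scaled so the target is proportional to exp(-g / lam2).
        for _ in range(inner_steps):
            act = np.tanh(theta @ X.T)                        # (particles, n)
            grad = ((1.0 - act ** 2) * avg_resid) @ X / n     # (particles, d)
            grad += 2.0 * lam1 * theta
            noise = rng.standard_normal(theta.shape)
            theta = theta - step_size * grad + np.sqrt(2.0 * step_size * lam2) * noise

    return theta


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((20, 3))
    y = np.tanh(X @ np.array([1.0, -1.0, 0.5]))
    particles = pda_sketch(X, y, n_particles=10, outer_steps=3, inner_steps=5)
    print(particles.shape)
```

Because the averaged potential is linear in the loss derivative at each data point, averaging the residuals is equivalent to averaging the potentials themselves, which keeps the sketch short.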
Pages: 14