A Convergence Analysis of Gradient Descent on Graph Neural Networks

Cited by: 0
Authors
Awasthi, Pranjal [1 ]
Das, Abhimanyu [1 ]
Gollapudi, Sreenivas [1 ]
Affiliations
[1] Google Research, Mountain View, CA 94043, USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Graph Neural Networks (GNNs) are a powerful class of architectures for solving learning problems on graphs. While many variants of GNNs have been proposed in the literature and have achieved strong empirical performance, their theoretical properties are less well understood. In this work, we study the convergence properties of the gradient descent algorithm when used to train GNNs. In particular, we consider the realizable setting where the data is generated from a network with unknown weights, and our goal is to study conditions under which gradient descent on a GNN architecture can recover near-optimal solutions. While such analysis has been performed in recent years for other architectures such as fully connected feed-forward networks, the message-passing nature of the updates in a GNN poses a new challenge in understanding the behavior of the gradient descent updates. We take a step towards overcoming this by proving that, for the case of deep linear GNNs, gradient descent provably recovers solutions up to error ε in O(log(1/ε)) iterations, under natural assumptions on the data distribution. Furthermore, for the case of one-round GNNs with ReLU activations, we show that gradient descent provably recovers solutions up to error ε in O((1/ε²) log(1/ε)) iterations.
Pages: 13
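
As a hedged illustration of the realizable setting described in the abstract above, the sketch below generates labels with a "teacher" one-round linear GNN with unknown weights and fits a "student" of the same architecture by full-batch gradient descent on the squared loss. All names, dimensions, and hyperparameters here are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

# Minimal sketch (assumed setup, not from the paper): a teacher one-round
# linear GNN y = A X W* labels the data, and a student with the same
# architecture is fit by full-batch gradient descent on the squared loss.
rng = np.random.default_rng(0)

n, d = 200, 16                                     # nodes, feature dimension
X = rng.normal(size=(n, d))                        # node features
A = (rng.random((n, n)) < 0.05).astype(float)      # random sparse graph
A = np.maximum(A, A.T)                             # make it undirected
A = A / np.maximum(A.sum(axis=1, keepdims=True), 1.0)  # row-normalize (mean aggregation)

W_star = rng.normal(size=(d, 1))                   # unknown teacher weights
y = A @ X @ W_star                                 # realizable (noise-free) labels

M = A @ X                                          # aggregated features seen by the linear layer
W = np.zeros((d, 1))                               # student initialization
eta, eps = 0.05, 1e-6                              # step size and target error (assumed values)

for t in range(20_000):
    resid = M @ W - y
    loss = 0.5 * float(np.mean(resid ** 2))
    if loss < eps:
        break
    W -= eta * (M.T @ resid) / n                   # gradient of the mean squared loss

print(f"stopped after {t} iterations with loss {loss:.2e}")
```

In this one-round linear toy case the loss is convex in W, so gradient descent converges at a linear rate, which mirrors the O(log(1/ε)) dependence stated for deep linear GNNs; the deep linear and ReLU settings analyzed in the paper require substantially more careful arguments.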