A Convergence Analysis of Gradient Descent on Graph Neural Networks

Cited by: 0
Authors
Awasthi, Pranjal [1 ]
Das, Abhimanyu [1 ]
Gollapudi, Sreenivas [1 ]
Affiliations
[1] Google Research, Mountain View, CA 94043, USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Graph Neural Networks (GNNs) are a powerful class of architectures for solving learning problems on graphs. While many variants of GNNs have been proposed in the literature and have achieved strong empirical performance, their theoretical properties are less well understood. In this work, we study the convergence properties of the gradient descent algorithm when used to train GNNs. In particular, we consider the realizable setting where the data is generated from a network with unknown weights, and our goal is to study conditions under which gradient descent on a GNN architecture can recover near-optimal solutions. While such analysis has been performed in recent years for other architectures such as fully connected feed-forward networks, the message-passing nature of the updates in a GNN poses a new challenge in understanding the behavior of the gradient descent updates. We take a step towards overcoming this by proving that, for the case of deep linear GNNs, gradient descent provably recovers solutions up to error ε in O(log(1/ε)) iterations, under natural assumptions on the data distribution. Furthermore, for the case of one-round GNNs with ReLU activations, we show that gradient descent provably recovers solutions up to error ε in O((1/ε²) log(1/ε)) iterations.
Pages: 13
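
As a hedged illustration of the realizable setting described in the abstract above, the sketch below generates labels with a "teacher" one-round linear GNN with unknown weights and fits a "student" of the same architecture by full-batch gradient descent on the squared loss. All names, dimensions, and hyperparameters here are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

# Minimal sketch (assumed setup, not from the paper): a teacher one-round
# linear GNN y = A X W* labels the data, and a student with the same
# architecture is fit by full-batch gradient descent on the squared loss.
rng = np.random.default_rng(0)

n, d = 200, 16                                     # nodes, feature dimension
X = rng.normal(size=(n, d))                        # node features
A = (rng.random((n, n)) < 0.05).astype(float)      # random sparse graph
A = np.maximum(A, A.T)                             # make it undirected
A = A / np.maximum(A.sum(axis=1, keepdims=True), 1.0)  # row-normalize (mean aggregation)

W_star = rng.normal(size=(d, 1))                   # unknown teacher weights
y = A @ X @ W_star                                 # realizable (noise-free) labels

M = A @ X                                          # aggregated features seen by the linear layer
W = np.zeros((d, 1))                               # student initialization
eta, eps = 0.05, 1e-6                              # step size and target error (assumed values)

for t in range(20_000):
    resid = M @ W - y
    loss = 0.5 * float(np.mean(resid ** 2))
    if loss < eps:
        break
    W -= eta * (M.T @ resid) / n                   # gradient of the mean squared loss

print(f"stopped after {t} iterations with loss {loss:.2e}")
```

In this one-round linear toy case the loss is convex in W, so gradient descent converges at a linear rate, which mirrors the O(log(1/ε)) dependence stated for deep linear GNNs; the deep linear and ReLU settings analyzed in the paper require substantially more careful arguments.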