On Representation Knowledge Distillation for Graph Neural Networks

Cited by: 8
Authors
Joshi, Chaitanya K. [1 ,2 ]
Liu, Fayao [1 ]
Xu, Xun [1]
Lin, Jie [1 ]
Foo, Chuan Sheng [1 ,3 ]
Affiliations
[1] A*STAR, Inst Infocomm Res, Singapore 138632, Singapore
[2] Univ Cambridge, Dept Comp Sci & Technol, Cambridge CB2 1TN, England
[3] A*STAR, Ctr Frontier AI Res, Singapore 138632, Singapore
Keywords
Topology; Task analysis; Point cloud compression; Graph neural networks; Benchmark testing; Training; Social networking (online); Contrastive learning; geometric deep learning; graph neural networks (GNNs); knowledge distillation (KD)
DOI
10.1109/TNNLS.2022.3223018
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Knowledge distillation (KD) is a learning paradigm for boosting resource-efficient graph neural networks (GNNs) using more expressive yet cumbersome teacher models. Past work on distillation for GNNs proposed the local structure preserving (LSP) loss, which matches local structural relationships, defined over edges, across the student's and teacher's node embeddings. This article studies whether preserving the global topology of how the teacher embeds graph data can be a more effective distillation objective for GNNs, as real-world graphs often contain latent interactions and noisy edges. We propose graph contrastive representation distillation (G-CRD), which uses contrastive learning to implicitly preserve global topology by aligning the student node embeddings to those of the teacher in a shared representation space. Additionally, we introduce an expanded set of benchmarks on large-scale real-world datasets where the performance gap between teacher and student GNNs is non-negligible. Experiments across four datasets and 14 heterogeneous GNN architectures show that G-CRD consistently boosts the performance and robustness of lightweight GNNs, outperforming LSP (and a global structure preserving (GSP) variant of LSP) as well as baselines from 2-D computer vision. An analysis of the representational similarity between teacher and student embedding spaces reveals that G-CRD balances preserving local and global relationships, while structure-preserving approaches are best at preserving one or the other.
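To make the G-CRD objective concrete, the sketch below shows a minimal PyTorch implementation of a contrastive representation distillation loss in the spirit described above: student and teacher node embeddings are projected into a shared space, each student node is pulled toward its teacher counterpart, and all other nodes in the batch serve as negatives. The projection heads, temperature value, and loss weighting here are illustrative assumptions, not the authors' exact implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveDistillLoss(nn.Module):
    # Illustrative G-CRD-style loss (one plausible form, not the paper's exact code):
    # align student node embeddings to the teacher's in a shared representation
    # space with an InfoNCE-style contrastive objective.
    def __init__(self, dim_student, dim_teacher, dim_shared=128, tau=0.1):
        super().__init__()
        # Linear heads map both embedding spaces into a shared space.
        self.proj_s = nn.Linear(dim_student, dim_shared)
        self.proj_t = nn.Linear(dim_teacher, dim_shared)
        self.tau = tau  # temperature (assumed default, tune per dataset)

    def forward(self, h_student, h_teacher):
        # h_student: (N, dim_student) student node embeddings.
        # h_teacher: (N, dim_teacher) teacher node embeddings (teacher is frozen).
        z_s = F.normalize(self.proj_s(h_student), dim=-1)
        z_t = F.normalize(self.proj_t(h_teacher.detach()), dim=-1)
        # Similarity of every student node to every teacher node: (N, N).
        logits = (z_s @ z_t.t()) / self.tau
        # Positive pair = the same node under teacher and student; all other
        # nodes act as negatives, implicitly preserving global topology.
        targets = torch.arange(h_student.size(0), device=h_student.device)
        return F.cross_entropy(logits, targets)

# Usage sketch: add to the task loss when training the lightweight student;
# lambda_kd is a hypothetical weighting hyperparameter.
# kd_loss = ContrastiveDistillLoss(64, 256)(student_embeddings, teacher_embeddings)
# loss = task_loss + lambda_kd * kd_loss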
Pages: 4656-4667 (12 pages)
Related Papers
(50 records in total)
  • [1] Graph-Free Knowledge Distillation for Graph Neural Networks
    Deng, Xiang; Zhang, Zhongfei
    Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI 2021, 2021: 2321-2327
  • [2] RELIANT: Fair Knowledge Distillation for Graph Neural Networks
    Dong, Yushun; Zhang, Binchi; Yuan, Yiling; Zou, Na; Wang, Qi; Li, Jundong
    Proceedings of the 2023 SIAM International Conference on Data Mining, SDM, 2023: 154+
  • [3] Online adversarial knowledge distillation for graph neural networks
    Wang, Can; Wang, Zhe; Chen, Defang; Zhou, Sheng; Feng, Yan; Chen, Chun
    Expert Systems with Applications, 2024, 237
  • [4] Knowledge Distillation Improves Graph Structure Augmentation for Graph Neural Networks
    Wu, Lirong; Lin, Haitao; Huang, Yufei; Li, Stan Z.
    Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022
  • [5] Accelerating Molecular Graph Neural Networks via Knowledge Distillation
    Kelvinius, Filip Ekstrom; Georgiev, Dimitar; Toshev, Artur Petrov; Gasteiger, Johannes
    Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023
  • [6] Knowledge Distillation with Graph Neural Networks for Epileptic Seizure Detection
    Zheng, Qinyue; Venkitaraman, Arun; Petravic, Simona; Frossard, Pascal
    Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track, ECML PKDD 2023, Part VI, 2023, 14174: 547-563
  • [7] Geometric Knowledge Distillation: Topology Compression for Graph Neural Networks
    Yang, Chenxiao; Wu, Qitian; Yan, Junchi
    Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022
  • [8] Boosting Graph Neural Networks via Adaptive Knowledge Distillation
    Guo, Zhichun; Zhang, Chunhui; Fan, Yujie; Tian, Yijun; Zhang, Chuxu; Chawla, Nitesh V.
    Thirty-Seventh AAAI Conference on Artificial Intelligence, Vol. 37, No. 6, 2023: 7793-7801
  • [9] FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks
    Feng, Kaituo; Li, Changsheng; Yuan, Ye; Wang, Guoren
    Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2022, 2022: 357-366
  • [10] A Lightweight Method for Graph Neural Networks Based on Knowledge Distillation and Graph Contrastive Learning
    Wang, Yong; Yang, Shuqun
    Applied Sciences-Basel, 2024, 14 (11)