On Representation Knowledge Distillation for Graph Neural Networks

Cited by: 8
Authors
Joshi, Chaitanya K. [1 ,2 ]
Liu, Fayao [1 ]
Xun, Xu [1 ]
Lin, Jie [1 ]
Foo, Chuan Sheng [1 ,3 ]
Affiliations
[1] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore
[2] Univ Cambridge, Dept Comp Sci & Technol, Cambridge CB2 1TN, England
[3] Ctr Frontier AI Res A STAR, Singapore 138632, Singapore
Keywords
Topology; Task analysis; Point cloud compression; Graph neural networks; Benchmark testing; Training; Social networking (online); Contrastive learning; geometric deep learning; graph neural networks (GNNs); knowledge distillation (KD);
DOI
10.1109/TNNLS.2022.3223018
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Knowledge distillation (KD) is a learning paradigm for boosting resource-efficient graph neural networks (GNNs) using more expressive yet cumbersome teacher models. Past work on distillation for GNNs proposed the local structure preserving (LSP) loss, which matches local structural relationships defined over edges across the student and teacher's node embeddings. This article studies whether preserving the global topology of how the teacher embeds graph data can be a more effective distillation objective for GNNs, as real-world graphs often contain latent interactions and noisy edges. We propose graph contrastive representation distillation (G-CRD), which uses contrastive learning to implicitly preserve global topology by aligning the student node embeddings to those of the teacher in a shared representation space. Additionally, we introduce an expanded set of benchmarks on large-scale real-world datasets where the performance gap between teacher and student GNNs is non-negligible. Experiments across four datasets and 14 heterogeneous GNN architectures show that G-CRD consistently boosts the performance and robustness of lightweight GNNs, outperforming LSP (and a global structure preserving (GSP) variant of LSP) as well as baselines from 2-D computer vision. An analysis of the representational similarity among teacher and student embedding spaces reveals that G-CRD balances preserving local and global relationships, while structure preserving approaches are best at preserving one or the other.
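The core idea of G-CRD as described above, aligning each student node embedding to the teacher's embedding of the same node while contrasting against all other nodes, can be sketched with a minimal InfoNCE-style loss. This is an illustrative pure-Python toy, not the authors' implementation: the function name, the fixed temperature `tau`, and the use of raw cosine similarity (rather than learned projection heads into a shared space) are all simplifying assumptions.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def contrastive_distillation_loss(student, teacher, tau=0.1):
    """InfoNCE-style distillation sketch: for node i, the positive pair is
    (student[i], teacher[i]); the teacher embeddings of all other nodes act
    as negatives. Minimizing this loss pulls matched pairs together and
    pushes mismatched pairs apart, implicitly preserving the global
    arrangement of the teacher's embedding space."""
    n = len(student)
    total = 0.0
    for i in range(n):
        logits = [cosine(student[i], teacher[j]) / tau for j in range(n)]
        log_z = math.log(sum(math.exp(l) for l in logits))
        total += -(logits[i] - log_z)  # negative log-softmax at the positive
    return total / n
```

A perfectly aligned student (identical embeddings) yields a near-zero loss, while a student whose nodes are matched to the wrong teacher nodes is heavily penalized, which is exactly the alignment pressure the abstract describes.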
Pages: 4656 - 4667
Page count: 12