On Representation Knowledge Distillation for Graph Neural Networks

Cited by: 8
Authors
Joshi, Chaitanya K. [1 ,2 ]
Liu, Fayao [1 ]
Xun, Xu [1 ]
Lin, Jie [1 ]
Foo, Chuan Sheng [1 ,3 ]
Affiliations
[1] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore
[2] Univ Cambridge, Dept Comp Sci & Technol, Cambridge CB2 1TN, England
[3] Ctr Frontier AI Res A STAR, Singapore 138632, Singapore
Keywords
Topology; Task analysis; Point cloud compression; Graph neural networks; Benchmark testing; Training; Social networking (online); Contrastive learning; geometric deep learning; graph neural networks (GNNs); knowledge distillation (KD);
DOI
10.1109/TNNLS.2022.3223018
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Knowledge distillation (KD) is a learning paradigm for boosting resource-efficient graph neural networks (GNNs) using more expressive yet cumbersome teacher models. Past work on distillation for GNNs proposed the local structure preserving (LSP) loss, which matches local structural relationships defined over edges across the student and teacher's node embeddings. This article studies whether preserving the global topology of how the teacher embeds graph data can be a more effective distillation objective for GNNs, as real-world graphs often contain latent interactions and noisy edges. We propose graph contrastive representation distillation (G-CRD), which uses contrastive learning to implicitly preserve global topology by aligning the student node embeddings to those of the teacher in a shared representation space. Additionally, we introduce an expanded set of benchmarks on large-scale real-world datasets where the performance gap between teacher and student GNNs is non-negligible. Experiments across four datasets and 14 heterogeneous GNN architectures show that G-CRD consistently boosts the performance and robustness of lightweight GNNs, outperforming LSP (and a global structure preserving (GSP) variant of LSP) as well as baselines from 2-D computer vision. An analysis of the representational similarity among teacher and student embedding spaces reveals that G-CRD balances preserving local and global relationships, while structure preserving approaches are best at preserving one or the other.
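The core idea of G-CRD as described above, aligning each student node embedding to the teacher's embedding of the same node while contrasting against all other nodes, can be sketched with a minimal InfoNCE-style loss. This is an illustrative pure-Python toy, not the authors' implementation: the function name, the fixed temperature `tau`, and the use of raw cosine similarity (rather than learned projection heads into a shared space) are all simplifying assumptions.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def contrastive_distillation_loss(student, teacher, tau=0.1):
    """InfoNCE-style distillation sketch: for node i, the positive pair is
    (student[i], teacher[i]); the teacher embeddings of all other nodes act
    as negatives. Minimizing this loss pulls matched pairs together and
    pushes mismatched pairs apart, implicitly preserving the global
    arrangement of the teacher's embedding space."""
    n = len(student)
    total = 0.0
    for i in range(n):
        logits = [cosine(student[i], teacher[j]) / tau for j in range(n)]
        log_z = math.log(sum(math.exp(l) for l in logits))
        total += -(logits[i] - log_z)  # negative log-softmax at the positive
    return total / n
```

A perfectly aligned student (identical embeddings) yields a near-zero loss, while a student whose nodes are matched to the wrong teacher nodes is heavily penalized, which is exactly the alignment pressure the abstract describes.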
Pages: 4656 - 4667
Page count: 12