Improving the Interpretability of Deep Neural Networks with Knowledge Distillation

Cited by: 70
Authors
Liu, Xuan [1 ]
Wang, Xiaoguang [1 ,2 ]
Matwin, Stan [1 ,3 ]
Affiliations
[1] Dalhousie Univ, Fac Comp Sci, Inst Big Data Analyt, Halifax, NS, Canada
[2] Alibaba Grp, Hangzhou, Zhejiang, Peoples R China
[3] Polish Acad Sci, Inst Comp Sci, Warsaw, Poland
Keywords
interpretation; neural networks; decision tree; TensorFlow; dark knowledge; knowledge distillation
DOI
10.1109/ICDMW.2018.00132
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Deep Neural Networks have achieved huge success across a wide spectrum of applications, from language modeling and computer vision to speech recognition. However, good performance alone is no longer enough to satisfy the needs of practical deployment, where interpretability is demanded in cases involving ethics and in mission-critical applications. The complexity of Deep Neural Network models makes it hard to understand and reason about their predictions, which hinders their further progress. To tackle this problem, we apply the knowledge distillation technique to distill Deep Neural Networks into decision trees in order to attain good performance and interpretability simultaneously. We formulate the problem at hand as a multi-output regression problem, and the experiments demonstrate that the student model achieves significantly better accuracy (about 1% to 5% higher) than vanilla decision trees at the same tree depth. The experiments are implemented on the TensorFlow platform to make the approach scalable to big datasets. To the best of our knowledge, we are the first to distill Deep Neural Networks into vanilla decision trees on multi-class datasets.
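As a rough illustration of the approach summarized in the abstract, the sketch below distills a softmax teacher network into a multi-output regression decision tree trained on softened class probabilities ("dark knowledge"). The use of Keras for the teacher, scikit-learn's DecisionTreeRegressor for the student, and all function names and the temperature value are illustrative assumptions, not the authors' TensorFlow implementation.

```python
# Hypothetical sketch: distill a DNN teacher into a multi-output
# regression decision tree (library choices are assumptions).
import numpy as np
import tensorflow as tf
from sklearn.tree import DecisionTreeRegressor

def distill_to_tree(teacher: tf.keras.Model, x_train: np.ndarray,
                    max_depth: int = 5, temperature: float = 2.0):
    # "Dark knowledge": soften the teacher's outputs with a temperature
    # so the student sees the relative similarity between classes.
    # Assumes the teacher returns pre-softmax logits.
    logits = teacher(x_train, training=False).numpy()
    soft_targets = tf.nn.softmax(logits / temperature).numpy()

    # Treat distillation as multi-output regression: the tree predicts
    # one continuous value per class (the softened probability).
    student = DecisionTreeRegressor(max_depth=max_depth)
    student.fit(x_train.reshape(len(x_train), -1), soft_targets)
    return student

def predict_classes(student: DecisionTreeRegressor, x: np.ndarray) -> np.ndarray:
    # The predicted class is the output dimension with the largest
    # regressed probability.
    return student.predict(x.reshape(len(x), -1)).argmax(axis=1)
```

The depth cap on the student tree corresponds to the "same level of tree depth" comparison against vanilla decision trees mentioned in the abstract.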
Pages: 905-912
Page count: 8