Improving the Interpretability of Deep Neural Networks with Knowledge Distillation

Cited by: 70
Authors
Liu, Xuan [1 ]
Wang, Xiaoguang [1 ,2 ]
Matwin, Stan [1 ,3 ]
Affiliations
[1] Dalhousie Univ, Fac Comp Sci, Inst Big Data Analyt, Halifax, NS, Canada
[2] Alibaba Grp, Hangzhou, Zhejiang, Peoples R China
[3] Polish Acad Sci, Inst Comp Sci, Warsaw, Poland
Keywords
interpretation; neural networks; decision tree; TensorFlow; dark knowledge; knowledge distillation
DOI
10.1109/ICDMW.2018.00132
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Deep Neural Networks have achieved huge success across a wide spectrum of applications, from language modeling and computer vision to speech recognition. However, good performance alone is no longer enough to satisfy the needs of practical deployment, where interpretability is demanded in cases involving ethics and in mission-critical applications. The complexity of Deep Neural Network models makes it hard to understand and reason about their predictions, which hinders their further progress. To tackle this problem, we apply the knowledge distillation technique to distill Deep Neural Networks into decision trees in order to attain good performance and interpretability simultaneously. We formulate the problem at hand as a multi-output regression problem, and the experiments demonstrate that the student model achieves significantly better accuracy (about 1% to 5% higher) than vanilla decision trees at the same tree depth. The experiments are implemented on the TensorFlow platform to make the approach scalable to big datasets. To the best of our knowledge, we are the first to distill Deep Neural Networks into vanilla decision trees on multi-class datasets.
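As a rough illustration of the approach summarized in the abstract, the sketch below distills a softmax teacher network into a multi-output regression decision tree trained on softened class probabilities ("dark knowledge"). The use of Keras for the teacher, scikit-learn's DecisionTreeRegressor for the student, and all function names and the temperature value are illustrative assumptions, not the authors' TensorFlow implementation.

```python
# Hypothetical sketch: distill a DNN teacher into a multi-output
# regression decision tree (library choices are assumptions).
import numpy as np
import tensorflow as tf
from sklearn.tree import DecisionTreeRegressor

def distill_to_tree(teacher: tf.keras.Model, x_train: np.ndarray,
                    max_depth: int = 5, temperature: float = 2.0):
    # "Dark knowledge": soften the teacher's outputs with a temperature
    # so the student sees the relative similarity between classes.
    # Assumes the teacher returns pre-softmax logits.
    logits = teacher(x_train, training=False).numpy()
    soft_targets = tf.nn.softmax(logits / temperature).numpy()

    # Treat distillation as multi-output regression: the tree predicts
    # one continuous value per class (the softened probability).
    student = DecisionTreeRegressor(max_depth=max_depth)
    student.fit(x_train.reshape(len(x_train), -1), soft_targets)
    return student

def predict_classes(student: DecisionTreeRegressor, x: np.ndarray) -> np.ndarray:
    # The predicted class is the output dimension with the largest
    # regressed probability.
    return student.predict(x.reshape(len(x), -1)).argmax(axis=1)
```

The depth cap on the student tree corresponds to the "same level of tree depth" comparison against vanilla decision trees mentioned in the abstract.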
Pages: 905-912
Page count: 8