Concept whitening for interpretable image recognition

Cited by: 151
Authors
Chen, Zhi [1]
Bei, Yijie [2]
Rudin, Cynthia [1,2]
Affiliations
[1] Duke Univ, Dept Comp Sci, Durham, NC 27706 USA
[2] Duke Univ, Dept Elect & Comp Engn, Durham, NC USA
Funding
National Science Foundation (USA)
DOI
10.1038/s42256-020-00265-z
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
What does a neural network encode about a concept as we traverse through the layers? Interpretability in machine learning is undoubtedly important, but the calculations of neural networks are very challenging to understand. Attempts to see inside their hidden layers can be misleading, unusable or rely on the latent space to possess properties that it may not have. Here, rather than attempting to analyse a neural network post hoc, we introduce a mechanism, called concept whitening (CW), to alter a given layer of the network to allow us to better understand the computation leading up to that layer. When a concept whitening module is added to a convolutional neural network, the latent space is whitened (that is, decorrelated and normalized) and the axes of the latent space are aligned with known concepts of interest. By experiment, we show that CW can provide us with a much clearer understanding of how the network gradually learns concepts over layers. CW is an alternative to a batch normalization layer in that it normalizes, and also decorrelates (whitens), the latent space. CW can be used in any layer of the network without hurting predictive performance. There is much interest in 'explainable' AI, but most efforts concern post hoc methods. Instead, a neural network can be made inherently interpretable, with an approach that involves making human-understandable concepts (aeroplane, bed, lamp and so on) align along the axes of its latent space.
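The mechanism the abstract describes can be made concrete with a short sketch. The PyTorch module below is a minimal illustration, not the authors' released code: the class name ConceptWhitening, the eps regularizer and the fixed identity rotation Q are assumptions for exposition. In the paper, Q is an orthogonal matrix optimized (on the Stiefel manifold) so that each concept of interest activates along its own latent axis.

```python
import torch
import torch.nn as nn


class ConceptWhitening(nn.Module):
    """Minimal sketch of a concept-whitening-style layer.

    Illustrates the two ingredients the abstract names: (1) whitening the
    latent space (ZCA: decorrelate and normalize), and (2) an orthogonal
    rotation Q whose axes are meant to align with known concepts.
    """

    def __init__(self, dim: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        # Placeholder rotation: in the paper, Q is optimized so that the
        # examples of each concept activate strongly along one axis; here
        # it is simply fixed to the identity.
        self.register_buffer("Q", torch.eye(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim) activations, e.g. spatially pooled CNN features.
        mu = x.mean(dim=0, keepdim=True)
        xc = x - mu
        cov = xc.T @ xc / max(x.shape[0] - 1, 1)
        eye = torch.eye(x.shape[1], device=x.device, dtype=x.dtype)
        # ZCA whitening matrix Sigma^{-1/2} via eigendecomposition;
        # eps keeps near-zero eigenvalues from blowing up.
        eigvals, eigvecs = torch.linalg.eigh(cov + self.eps * eye)
        w = eigvecs @ torch.diag(eigvals.rsqrt()) @ eigvecs.T
        z = xc @ w            # decorrelated, unit-variance latent features
        return z @ self.Q     # rotate so axes line up with concepts
```

As a usage sketch, `ConceptWhitening(512)` applied to a batch of pooled features such as `torch.randn(64, 512)` returns decorrelated, unit-variance activations. Because the module, like batch normalization, only normalizes and rotates the latent space, it can be swapped in for a batch-normalization layer without changing the rest of the architecture, which is what the abstract means by CW being usable in any layer without hurting predictive performance.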
Pages: 772-782
Page count: 12
Related Papers (50 in total)
  • [31] VICE: Variational Interpretable Concept Embeddings
    Muttenthaler, Lukas
    Zheng, Charles Y.
    McClure, Patrick
    Vandermeulen, Robert A.
    Hebart, Martin N.
    Pereira, Francisco
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [32] Learning Interpretable Concept Groups in CNNs
    Varshneya, Saurabh
    Ledent, Antoine
    Vandermeulen, Robert A.
    Lei, Yunwen
    Enders, Matthias
    Borth, Damian
    Kloft, Marius
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 1061 - 1067
  • [33] Interpretable Optimization Training Strategy-Based DCNN and Its Application on CT Image Recognition
    Wang, Ronghan
    Liu, Tao
    Lu, Junwei
    Zhou, Yuwei
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2022, 2022
  • [34] Color pattern recognition with circular component whitening
    Moreno, I
    Kober, V
    Lashin, V
    Campos, J
    Yaroslavsky, LP
    Yzuel, MJ
    OPTICS LETTERS, 1996, 21 (07) : 498 - 500
  • [35] SAR-AD-BagNet: An Interpretable Model for SAR Image Recognition Based on Adversarial Defense
    Li, Peng
    Hu, Xiaowei
    Feng, Cunqian
    Shi, Xiaozhen
    Guo, Yiduo
    Feng, Weike
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20
  • [37] The negative effects of whitening transformation in face recognition
    Song, Fengxi
    Zhang, David
    SECOND INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN, VOL 2, PROCEEDINGS, 2009, : 437 - +
  • [38] Whitening preprocessing of color components for pattern recognition
    Moreno, I
    Kober, V
    Lashin, V
    Campos, J
    Yaroslavsky, LP
    Yzuel, MJ
    SECOND IBEROAMERICAN MEETING ON OPTICS, 1996, 2730 : 617 - 621
  • [39] Interpretable Gait Recognition by Granger Causality
    Balazia, Michal
    Hlavackova-Schindler, Katerina
    Sojka, Petr
    Plant, Claudia
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 1069 - 1075
  • [40] Image enhancement with wavelet-optimized whitening
    Auchere, F.
    Soubrie, E.
    Pelouze, G.
    Buchlin, E.
    ASTRONOMY & ASTROPHYSICS, 2023, 670