Concept whitening for interpretable image recognition

Cited by: 151
|
Authors
Chen, Zhi [1 ]
Bei, Yijie [2 ]
Rudin, Cynthia [1 ,2 ]
Affiliations
[1] Duke Univ, Dept Comp Sci, Durham, NC 27706 USA
[2] Duke Univ, Dept Elect & Comp Engn, Durham, NC USA
Funding
U.S. National Science Foundation
DOI
10.1038/s42256-020-00265-z
Chinese Library Classification (CLC) code
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
What does a neural network encode about a concept as we traverse through the layers? Interpretability in machine learning is undoubtedly important, but the calculations of neural networks are very challenging to understand. Attempts to see inside their hidden layers can be misleading, unusable or rely on the latent space to possess properties that it may not have. Here, rather than attempting to analyse a neural network post hoc, we introduce a mechanism, called concept whitening (CW), to alter a given layer of the network to allow us to better understand the computation leading up to that layer. When a concept whitening module is added to a convolutional neural network, the latent space is whitened (that is, decorrelated and normalized) and the axes of the latent space are aligned with known concepts of interest. By experiment, we show that CW can provide us with a much clearer understanding of how the network gradually learns concepts over layers. CW is an alternative to a batch normalization layer in that it normalizes, and also decorrelates (whitens), the latent space. CW can be used in any layer of the network without hurting predictive performance. There is much interest in 'explainable' AI, but most efforts concern post hoc methods. Instead, a neural network can be made inherently interpretable, with an approach that involves making human-understandable concepts (aeroplane, bed, lamp and so on) align along the axes of its latent space.
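As a rough, self-contained illustration of the mechanism the abstract describes, the NumPy sketch below whitens a batch of latent activations and then applies an orthogonal rotation whose leading axes point toward directions estimated from small sets of concept-example activations. This is not the authors' implementation (in CW the alignment is learned as part of training); the function names, the toy data and the way the rotation is constructed here are illustrative assumptions.

import numpy as np

# Sketch: whiten latent activations, then rotate the whitened space so that
# its leading axes follow estimated concept directions (illustrative only).

def zca_whitening_matrix(z, eps=1e-5):
    # ZCA whitening matrix for activations z of shape (n, d): after applying
    # it, the latent dimensions are decorrelated and normalized.
    zc = z - z.mean(axis=0, keepdims=True)
    cov = zc.T @ zc / (len(zc) - 1)
    vals, vecs = np.linalg.eigh(cov)
    return vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T

def concept_rotation(concept_dirs):
    # Orthogonal matrix whose leading columns follow the given (already
    # whitened) concept directions; QR fills in the remaining orthonormal axes.
    d = concept_dirs.shape[1]
    q, _ = np.linalg.qr(np.vstack([concept_dirs, np.eye(d)]).T)
    return q

rng = np.random.default_rng(0)
z = rng.normal(size=(512, 8))                    # latent activations of one layer
W = zca_whitening_matrix(z)
zw = (z - z.mean(axis=0)) @ W.T                  # whitened latent space

# Toy "concept" example sets that activate different latent dimensions.
m1 = np.array([2.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
m2 = np.array([0.0, 0.0, 2.0, 2.0, 0.0, 0.0, 0.0, 0.0])
concepts = [rng.normal(loc=m1, size=(64, 8)), rng.normal(loc=m2, size=(64, 8))]
dirs = np.stack([((c - z.mean(axis=0)) @ W.T).mean(axis=0) for c in concepts])
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)

Q = concept_rotation(dirs)
z_cw = zw @ Q                                    # axes 0 and 1 now track the two concepts
print(np.round(np.cov(z_cw, rowvar=False)[:4, :4], 2))   # approximately the identity

The sketch only conveys the geometry the abstract refers to: after whitening and rotation the latent covariance is close to the identity, and individual axes can be read as scores for named concepts, which is what makes the layer inspectable.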
Pages: 772-782
Page count: 12
Related papers
50 records in total
  • [21] Explainable AI in drug discovery: self-interpretable graph neural network for molecular property prediction using concept whitening
    Michela Proietti
    Alessio Ragno
    Biagio La Rosa
    Rino Ragno
    Roberto Capobianco
    Machine Learning, 2024, 113 : 2013 - 2044
  • [22] These do not Look Like Those: An Interpretable Deep Learning Model for Image Recognition
    Singh, Gurmail
    Yow, Kin-Choong
    IEEE ACCESS, 2021, 9(9): 41482 - 41493
  • [23] Parkinson's Disease Recognition Using SPECT Image and Interpretable AI: A Tutorial
    Pianpanit, Theerasarn
    Lolak, Sermkiat
    Sawangjai, Phattarapong
    Sudhawiyangkul, Thapanun
    Wilaiprasitporn, Theerawit
    IEEE SENSORS JOURNAL, 2021, 21 (20) : 22304 - 22316
  • [24] Interpretable multimodal emotion recognition using hybrid fusion of speech and image data
    Puneet Kumar
    Sarthak Malik
    Balasubramanian Raman
    Multimedia Tools and Applications, 2024, 83 : 28373 - 28394
  • [25] Interpretable multimodal emotion recognition using hybrid fusion of speech and image data
    Kumar, Puneet
    Malik, Sarthak
    Raman, Balasubramanian
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (10) : 28373 - 28394
  • [26] A biologically-inspired concept for active image recognition
    Suri, RE
    INTERNATIONAL CONFERENCE ON INTEGRATION OF KNOWLEDGE INTENSIVE MULTI-AGENT SYSTEMS: KIMAS'03: MODELING, EXPLORATION, AND ENGINEERING, 2003, : 379 - 384
  • [27] Interpretable Image Recognition by Screening Class-Specific and Class-Shared Prototypes
    Li, Xiaomeng
    Wang, Jiaqi
    Jing, Liping
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT II, 2023, 14255 : 397 - 408
  • [28] Interpretable Dimensionality Reduction in 3D Image Recognition with Small Sample Sizes
    Ivan Koptev
    Jiacheng Tian
    Eddie Peel
    Rachel Barker
    Cameron Walker
    Andreas W. Kempa-Liehr
    Journal of Nondestructive Evaluation, 2025, 44 (2)
  • [29] Towards Interpretable Face Recognition
    Yin, Bangjie
    Tran, Luan
    Li, Haoxiang
    Shen, Xiaohui
    Liu, Xiaoming
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 9347 - 9356
  • [30] Recognition of attentive objects with a concept association network for image annotation
    Fu, Hong
    Chi, Zheru
    Feng, Dagan
    PATTERN RECOGNITION, 2010, 43 (10) : 3539 - 3547