Explaining Deep Learning Models with Constrained Adversarial Examples

Cited by: 17
Authors
Moore, Jonathan [1 ]
Hammerla, Nils [1 ]
Watkins, Chris [2 ]
Affiliations
[1] Babylon Hlth, London SW3 3DD, England
[2] Royal Holloway Univ London, Egham, Surrey, England
Keywords
Explainable AI; Adversarial examples; Counterfactual explanations; INTERPRETABILITY;
DOI
10.1007/978-3-030-29908-8_4
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Machine learning algorithms generally suffer from a problem of explainability. Given a classification result from a model, it is typically hard to determine what caused the decision to be made and to give an informative explanation. We explore a new method of generating counterfactual explanations, which, instead of explaining why a particular classification was made, explains how a different outcome can be achieved. This gives the recipients of the explanation a better way to understand the outcome and provides an actionable suggestion. We show that the introduced method of Constrained Adversarial Examples (CADEX) can be used in real-world applications, and that it yields explanations which incorporate business or domain constraints, such as handling categorical attributes and range constraints.
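The sketch below is a minimal, hypothetical illustration of the general idea the abstract describes (gradient-based search for a nearby input that the model classifies differently, projected onto domain constraints). It is not the paper's CADEX implementation; the names model, lower, upper, and frozen_mask are assumptions introduced only for illustration.

# Hypothetical sketch of a constrained counterfactual search, not the CADEX method itself.
# Assumes: a trained PyTorch classifier `model`, a 1-D feature vector `x`,
# per-feature range tensors `lower`/`upper`, and a boolean `frozen_mask` marking
# features (e.g. one-hot categorical attributes) that must not change.
import torch

def constrained_counterfactual(model, x, target_class, lower, upper,
                               frozen_mask, steps=200, lr=0.05):
    """Search for an x' close to x that `model` assigns to `target_class`,
    clamping each feature into [lower, upper] and keeping frozen features fixed."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss_fn = torch.nn.CrossEntropyLoss()
    target = torch.tensor([target_class])

    for _ in range(steps):
        logits = model(x_adv.unsqueeze(0))
        loss = loss_fn(logits, target)
        loss.backward()

        with torch.no_grad():
            x_adv -= lr * x_adv.grad                      # step toward the target class
            x_adv[frozen_mask] = x[frozen_mask]           # immutable/categorical features stay fixed
            x_adv.copy_(torch.max(torch.min(x_adv, upper), lower))  # enforce range constraints
            done = model(x_adv.unsqueeze(0)).argmax(dim=1).item() == target_class
        x_adv.grad.zero_()
        if done:
            break
    return x_adv.detach()

Under these assumptions, the difference x_adv - x reads directly as the actionable suggestion the abstract mentions: which mutable features would need to change, and by how much, to obtain the other outcome.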
Pages: 43-56
Number of pages: 14
Related papers
50 records in total
  • [31] Adversarial examples for generative models
    Kos, Jernej
    Fischer, Ian
    Song, Dawn
    2018 IEEE SYMPOSIUM ON SECURITY AND PRIVACY WORKSHOPS (SPW 2018), 2018, : 36 - 42
  • [32] Adversarial Examples for Models of Code
    Yefet, Noam
    Alon, Uri
    Yahav, Eran
    PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL, 2020, 4 (04):
  • [33] Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples
    Barredo-Arrieta, Alejandro
    Del Ser, Javier
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [34] Experiments on Adversarial Examples for Deep Learning Model Using Multimodal Sensors
    Kurniawan, Ade
    Ohsita, Yuichi
    Murata, Masayuki
    SENSORS, 2022, 22 (22)
  • [35] ADVERSARIAL-PLAYGROUND: A Visualization Suite Showing How Adversarial Examples Fool Deep Learning
    Norton, Andrew P.
    Qi, Yanjun
    2017 IEEE SYMPOSIUM ON VISUALIZATION FOR CYBER SECURITY (VIZSEC), 2017,
  • [36] Detection of sensors used for adversarial examples against machine learning models
    Kurniawan, Ade
    Ohsita, Yuichi
    Murata, Masayuki
    Results in Engineering, 2024, 24
  • [37] Privacy Risks of Securing Machine Learning Models against Adversarial Examples
    Song, Liwei
    Shokri, Reza
    Mittal, Prateek
    PROCEEDINGS OF THE 2019 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY (CCS'19), 2019, : 241 - 257
  • [38] EXPLAINING DEEP MODELS THROUGH FORGETTABLE LEARNING DYNAMICS
    Benkert, Ryan
    Aribido, Oluwaseun Joseph
    AlRegib, Ghassan
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 3692 - 3696
  • [39] Explaining Deep Learning-Based Driver Models
    Lorente, Maria Paz Sesmero
    Lopez, Elena Magan
    Florez, Laura Alvarez
    Espino, Agapito Ledezma
    Martinez, Jose Antonio Iglesias
    de Miguel, Araceli Sanchis
    APPLIED SCIENCES-BASEL, 2021, 11 (08):
  • [40] Explaining Deep Learning Models for Low Vision Prognosis
    Gui, Haiwen
    Tseng, Benjamin
    Hu, Wendeng
    Wang, Sophia Y.
    INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2022, 63 (07)