Explaining Deep Learning Models with Constrained Adversarial Examples

Cited by: 17
Authors
Moore, Jonathan [1 ]
Hammerla, Nils [1 ]
Watkins, Chris [2 ]
Affiliations
[1] Babylon Health, London SW3 3DD, England
[2] Royal Holloway, University of London, Egham, Surrey, England
Keywords
Explainable AI; Adversarial examples; Counterfactual explanations; Interpretability
DOI
10.1007/978-3-030-29908-8_4
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104 ; 0812 ; 0835 ; 1405
Abstract
Machine learning models generally suffer from a lack of explainability: given a classification result, it is typically hard to determine what caused the decision and to give an informative explanation. We explore a new method of generating counterfactual explanations which, rather than explaining why a particular classification was made, show how a different outcome can be achieved. This gives the recipient of the explanation a better way to understand the outcome and provides an actionable suggestion. We show that the introduced method of Constrained Adversarial Examples (CADEX) can be used in real-world applications, and that it yields explanations which incorporate business or domain constraints, such as handling categorical attributes and range constraints.
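To make the idea concrete, the following is a minimal sketch in PyTorch of gradient-based counterfactual search with a range-constraint projection, in the spirit of what the abstract describes. It assumes a differentiable classifier; the function name, optimizer choice, and projection step are illustrative assumptions, not the authors' actual CADEX implementation, which additionally handles categorical attributes.

import torch

def constrained_counterfactual(model, x, target_class, feat_min, feat_max,
                               lr=0.01, max_iters=500):
    # Hypothetical sketch: perturb a copy of x by gradient descent until the
    # model predicts target_class, projecting each step back into the valid
    # feature ranges [feat_min, feat_max] so the counterfactual stays plausible.
    x_cf = x.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([x_cf], lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    target = torch.tensor([target_class])
    for _ in range(max_iters):
        logits = model(x_cf.unsqueeze(0))
        if logits.argmax(dim=1).item() == target_class:
            break  # desired outcome reached; x_cf is the counterfactual
        loss = loss_fn(logits, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            # Range-constraint projection; the paper's handling of
            # categorical attributes is omitted in this sketch.
            x_cf.copy_(torch.min(torch.max(x_cf, feat_min), feat_max))
    return x_cf.detach()

The difference x_cf - x can then be read as the actionable suggestion the abstract refers to: the smallest constraint-respecting change to the input that flips the model's decision.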
Pages: 43 - 56 (14 pages)
Related papers (50 records)
  • [1] Explaining the Rationale of Deep Learning Glaucoma Decisions with Adversarial Examples
    Chang, Jooyoung
    Lee, Jinho
    Ha, Ahnul
    Han, Young Soo
    Bak, Eunoo
    Choi, Seulggie
    Yun, Jae Moon
    Kang, Uk
    Shin, Il Hyung
    Shin, Joo Young
    Ko, Taehoon
    Seul, Ye
    Oh, K-Lok
    Park, Ki Ho
    Park, Sang Min
    OPHTHALMOLOGY, 2021, 128 (01) : 78 - 88
  • [2] Explaining Image Misclassification in Deep Learning via Adversarial Examples
    Haffar, Rami
    Jebreel, Najeeb Moharram
    Domingo-Ferrer, Josep
    Sanchez, David
    MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE (MDAI 2021), 2021, 12898 : 323 - 334
  • [3] Impact of Adversarial Examples on Deep Learning Models for Biomedical Image Segmentation
    Ozbulak, Utku
    Van Messem, Arnout
    De Neve, Wesley
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2019, PT II, 2019, 11765 : 300 - 308
  • [4] Metamorphic Detection of Adversarial Examples in Deep Learning Models With Affine Transformations
    Mekala, Rohan Reddy
    Magnusson, Gudjon Einar
    Porter, Adam
    Lindvall, Mikael
    Diep, Madeline
    2019 IEEE/ACM 4TH INTERNATIONAL WORKSHOP ON METAMORPHIC TESTING (MET 2019), 2019, : 55 - 62
  • [5] The Problem of the Adversarial Examples in Deep Learning
    Zhang S.-S.
    Zuo X.
    Liu J.-W.
Jisuanji Xuebao/Chinese Journal of Computers, 2019, 42 (08): 1886 - 1904
  • [6] Analysing Adversarial Examples for Deep Learning
    Jung, Jason
    Akhtar, Naveed
    Hassan, Ghulam
    VISAPP: PROCEEDINGS OF THE 16TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS - VOL. 5: VISAPP, 2021, : 585 - 592
  • [7] Feature-Based Adversarial Training for Deep Learning Models Resistant to Transferable Adversarial Examples
    Ryu, Gwonsang
    Choi, Daeseon
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2022, E105D (05) : 1039 - 1049
  • [8] Interpreting Adversarial Examples in Deep Learning: A Review
    Han, Sicong
    Lin, Chenhao
    Shen, Chao
    Wang, Qian
    Guan, Xiaohong
    ACM COMPUTING SURVEYS, 2023, 55 (14S)
  • [9] Adversarial Examples: Attacks and Defenses for Deep Learning
Yuan, Xiaoyong
    He, Pan
    Zhu, Qile
    Li, Xiaolin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (09) : 2805 - 2824
  • [10] CloudLeak: Large-Scale Deep Learning Models Stealing Through Adversarial Examples
    Yu, Honggang
    Yang, Kaichen
    Zhang, Teng
    Tsai, Yun-Yun
    Ho, Tsung-Yi
    Jin, Yier
27TH ANNUAL NETWORK AND DISTRIBUTED SYSTEM SECURITY SYMPOSIUM (NDSS 2020), 2020