Explaining Deep Learning Models with Constrained Adversarial Examples

Times Cited: 17
Authors
Moore, Jonathan [1 ]
Hammerla, Nils [1 ]
Watkins, Chris [2 ]
Affiliations
[1] Babylon Health, London SW3 3DD, England
[2] Royal Holloway, University of London, Egham, Surrey, England
Keywords
Explainable AI; Adversarial examples; Counterfactual explanations; Interpretability
DOI
10.1007/978-3-030-29908-8_4
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Machine learning models generally suffer from a lack of explainability: given a classification result, it is typically hard to determine what caused the decision and to give an informative explanation. We explore a new method of generating counterfactual explanations which, instead of explaining why a particular classification was made, explains how a different outcome could be achieved. This gives the recipient of the explanation a better way to understand the outcome and provides an actionable suggestion. We show that the proposed method of Constrained Adversarial Examples (CADEX) can be used in real-world applications, and that it yields explanations which incorporate business or domain constraints, such as the handling of categorical attributes and range constraints.
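To make the idea in the abstract concrete, below is a minimal sketch of a CADEX-style counterfactual search, written in PyTorch under stated assumptions: the classifier is differentiable, attributes that must not change are frozen with a gradient mask, and range constraints are enforced by clipping after each step. The function constrained_counterfactual and its constraint handling are illustrative stand-ins, not the authors' exact implementation (which, per the abstract, also handles categorical attributes).

```python
# Illustrative CADEX-style counterfactual search (not the authors' exact code).
# Assumptions: a differentiable classifier over tabular features; immutable
# attributes frozen via a gradient mask; range constraints enforced by clipping.
import torch
import torch.nn as nn

def constrained_counterfactual(model, x, target_class, mask, lo, hi,
                               lr=0.05, steps=500):
    """Gradient-descend on the input until `model` predicts `target_class`.

    mask   -- 1.0 for attributes allowed to change, 0.0 for frozen ones
    lo, hi -- per-attribute lower/upper bounds (range constraints)
    """
    x_cf = x.clone().detach().requires_grad_(True)
    loss_fn = nn.CrossEntropyLoss()
    target = torch.tensor([target_class])
    for _ in range(steps):
        logits = model(x_cf.unsqueeze(0))
        if logits.argmax(dim=1).item() == target_class:
            break  # decision flipped: counterfactual found
        loss = loss_fn(logits, target)
        (grad,) = torch.autograd.grad(loss, x_cf)
        with torch.no_grad():
            x_cf -= lr * grad * mask  # masked attributes never move
            x_cf.clamp_(lo, hi)       # stay inside the allowed ranges
    return x_cf.detach()

# Toy usage: a random linear classifier over four bounded features,
# with the third attribute marked immutable.
torch.manual_seed(0)
model = nn.Linear(4, 2)
x = torch.tensor([0.2, 0.8, 0.5, 0.1])
mask = torch.tensor([1.0, 1.0, 0.0, 1.0])
x_cf = constrained_counterfactual(model, x, target_class=1, mask=mask,
                                  lo=torch.zeros(4), hi=torch.ones(4))
print(x_cf, model(x_cf.unsqueeze(0)).argmax(dim=1))
```

Clipping after every step keeps the candidate feasible throughout the search, so the returned point respects the range constraints even if the class never flips; the difference x_cf - x is then the actionable suggestion the explanation conveys.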
Pages: 43 - 56
Page count: 14
Related Papers
50 items in total
  • [41] Deep learning models for electrocardiograms are susceptible to adversarial attack
    Han, Xintian
    Hu, Yuxuan
    Foschini, Luca
    Chinitz, Larry
    Jankelson, Lior
    Ranganath, Rajesh
    NATURE MEDICINE, 2020, 26 (03) : 360 - 363
  • [42] Defending Deep Learning Models Against Adversarial Attacks
    Mani, Nag
    Moh, Melody
    Moh, Teng-Sheng
    INTERNATIONAL JOURNAL OF SOFTWARE SCIENCE AND COMPUTATIONAL INTELLIGENCE-IJSSCI, 2021, 13 (01) : 72 - 89
  • [43] Adversarial attacks on deep learning models in smart grids
    Hao, Jingbo
    Tao, Yang
    ENERGY REPORTS, 2022, 8 : 123 - 129
  • [44] On the Robustness of Deep Learning Models to Universal Adversarial Attack
    Karim, Rezaul
    Islam, Md Amirul
    Mohammed, Noman
    Bruce, Neil D. B.
    2018 15TH CONFERENCE ON COMPUTER AND ROBOT VISION (CRV), 2018, : 55 - 62
  • [45] Robust Adversarial Objects against Deep Learning Models
    Tsai, Tzungyu
    Yang, Kaichen
    Ho, Tsung-Yi
    Jin, Yier
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 954 - 962
  • [47] Safe batch constrained deep reinforcement learning with generative adversarial network
    Dong, Wenbo
    Liu, Shaofan
    Sun, Shiliang
    INFORMATION SCIENCES, 2023, 634 : 259 - 270
  • [48] Houdini: Fooling Deep Structured Visual and Speech Recognition Models with Adversarial Examples
    Cisse, Moustapha
    Adi, Yossi
    Neverova, Natalia
    Keshet, Joseph
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [49] Is Deep Learning Safe for Robot Vision? Adversarial Examples against the iCub Humanoid
    Melis, Marco
    Demontis, Ambra
    Biggio, Battista
    Brown, Gavin
    Fumera, Giorgio
    Roli, Fabio
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017), 2017, : 751 - 759
  • [50] Using Adversarial Examples to Bypass Deep Learning Based URL Detection System
    Chen, Wencheng
    Zeng, Yi
    Qiu, Meikang
    4TH IEEE INTERNATIONAL CONFERENCE ON SMART CLOUD (SMARTCLOUD 2019) / 3RD INTERNATIONAL SYMPOSIUM ON REINFORCEMENT LEARNING (ISRL 2019), 2019, : 128 - 130