Explaining Deep Learning Models with Constrained Adversarial Examples

Cited by: 17
Authors
Moore, Jonathan [1 ]
Hammerla, Nils [1 ]
Watkins, Chris [2 ]
Affiliations
[1] Babylon Health, London SW3 3DD, England
[2] Royal Holloway, University of London, Egham, Surrey, England
Keywords
Explainable AI; Adversarial examples; Counterfactual explanations; Interpretability
DOI
10.1007/978-3-030-29908-8_4
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104 ; 0812 ; 0835 ; 1405
Abstract
Machine learning models generally suffer from a lack of explainability: given a classification result, it is typically hard to determine what caused the decision and to give an informative explanation. We explore a new method of generating counterfactual explanations which, rather than explaining why a particular classification was made, show how a different outcome can be achieved. This gives the recipient of the explanation a better way to understand the outcome and provides an actionable suggestion. We show that the introduced method of Constrained Adversarial Examples (CADEX) can be used in real-world applications, and that it yields explanations which incorporate business or domain constraints, such as handling categorical attributes and range constraints.
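To make the idea concrete, the following is a minimal sketch in PyTorch of gradient-based counterfactual search with a range-constraint projection, in the spirit of what the abstract describes. It assumes a differentiable classifier; the function name, optimizer choice, and projection step are illustrative assumptions, not the authors' actual CADEX implementation, which additionally handles categorical attributes.

import torch

def constrained_counterfactual(model, x, target_class, feat_min, feat_max,
                               lr=0.01, max_iters=500):
    # Hypothetical sketch: perturb a copy of x by gradient descent until the
    # model predicts target_class, projecting each step back into the valid
    # feature ranges [feat_min, feat_max] so the counterfactual stays plausible.
    x_cf = x.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([x_cf], lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    target = torch.tensor([target_class])
    for _ in range(max_iters):
        logits = model(x_cf.unsqueeze(0))
        if logits.argmax(dim=1).item() == target_class:
            break  # desired outcome reached; x_cf is the counterfactual
        loss = loss_fn(logits, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            # Range-constraint projection; the paper's handling of
            # categorical attributes is omitted in this sketch.
            x_cf.copy_(torch.min(torch.max(x_cf, feat_min), feat_max))
    return x_cf.detach()

The difference x_cf - x can then be read as the actionable suggestion the abstract refers to: the smallest constraint-respecting change to the input that flips the model's decision.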
Pages: 43 - 56 (14 pages)
Related papers (50 records)
  • [1] Explaining the Rationale of Deep Learning Glaucoma Decisions with Adversarial Examples
    Chang, Jooyoung
    Lee, Jinho
    Ha, Ahnul
    Han, Young Soo
    Bak, Eunoo
    Choi, Seulggie
    Yun, Jae Moon
    Kang, Uk
    Shin, Il Hyung
    Shin, Joo Young
    Ko, Taehoon
    Seul, Ye
    Oh, K-Lok
    Park, Ki Ho
    Park, Sang Min
    OPHTHALMOLOGY, 2021, 128 (01) : 78 - 88
  • [2] Explaining Image Misclassification in Deep Learning via Adversarial Examples
    Haffar, Rami
    Jebreel, Najeeb Moharram
    Domingo-Ferrer, Josep
    Sanchez, David
    MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE (MDAI 2021), 2021, 12898 : 323 - 334
  • [3] Impact of Adversarial Examples on Deep Learning Models for Biomedical Image Segmentation
    Ozbulak, Utku
    Van Messem, Arnout
    De Neve, Wesley
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2019, PT II, 2019, 11765 : 300 - 308
  • [4] Metamorphic Detection of Adversarial Examples in Deep Learning Models With Affine Transformations
    Mekala, Rohan Reddy
    Magnusson, Gudjon Einar
    Porter, Adam
    Lindvall, Mikael
    Diep, Madeline
    2019 IEEE/ACM 4TH INTERNATIONAL WORKSHOP ON METAMORPHIC TESTING (MET 2019), 2019, : 55 - 62
  • [5] The Problem of the Adversarial Examples in Deep Learning
    Zhang S.-S.
    Zuo X.
    Liu J.-W.
Jisuanji Xuebao/Chinese Journal of Computers, 2019, 42 (08): 1886 - 1904
  • [6] Analysing Adversarial Examples for Deep Learning
    Jung, Jason
    Akhtar, Naveed
    Hassan, Ghulam
    VISAPP: PROCEEDINGS OF THE 16TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS - VOL. 5: VISAPP, 2021, : 585 - 592
  • [7] Feature-Based Adversarial Training for Deep Learning Models Resistant to Transferable Adversarial Examples
    Ryu, Gwonsang
    Choi, Daeseon
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2022, E105D (05) : 1039 - 1049
  • [8] Interpreting Adversarial Examples in Deep Learning: A Review
    Han, Sicong
    Lin, Chenhao
    Shen, Chao
    Wang, Qian
    Guan, Xiaohong
    ACM COMPUTING SURVEYS, 2023, 55 (14S)
  • [9] Adversarial Examples: Attacks and Defenses for Deep Learning
Yuan, Xiaoyong
    He, Pan
    Zhu, Qile
    Li, Xiaolin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (09) : 2805 - 2824
  • [10] CloudLeak: Large-Scale Deep Learning Models Stealing Through Adversarial Examples
    Yu, Honggang
    Yang, Kaichen
    Zhang, Teng
    Tsai, Yun-Yun
    Ho, Tsung-Yi
    Jin, Yier
27TH ANNUAL NETWORK AND DISTRIBUTED SYSTEM SECURITY SYMPOSIUM (NDSS 2020), 2020