Explaining Deep Learning Models with Constrained Adversarial Examples

Cited by: 17
Authors
Moore, Jonathan [1 ]
Hammerla, Nils [1 ]
Watkins, Chris [2 ]
Affiliations
[1] Babylon Hlth, London SW3 3DD, England
[2] Royal Holloway Univ London, Egham, Surrey, England
Keywords
Explainable AI; Adversarial examples; Counterfactual explanations; INTERPRETABILITY;
DOI
10.1007/978-3-030-29908-8_4
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Machine learning algorithms generally suffer from a problem of explainability. Given a classification result from a model, it is typically hard to determine what caused the decision to be made and to give an informative explanation. We explore a new method of generating counterfactual explanations, which, instead of explaining why a particular classification was made, explains how a different outcome can be achieved. This gives the recipients of the explanation a better way to understand the outcome and provides an actionable suggestion. We show that the introduced method of Constrained Adversarial Examples (CADEX) can be used in real-world applications, and that it yields explanations which incorporate business or domain constraints, such as handling categorical attributes and range constraints.
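The sketch below is a minimal, hypothetical illustration of the general idea the abstract describes (gradient-based search for a nearby input that the model classifies differently, projected onto domain constraints). It is not the paper's CADEX implementation; the names model, lower, upper, and frozen_mask are assumptions introduced only for illustration.

# Hypothetical sketch of a constrained counterfactual search, not the CADEX method itself.
# Assumes: a trained PyTorch classifier `model`, a 1-D feature vector `x`,
# per-feature range tensors `lower`/`upper`, and a boolean `frozen_mask` marking
# features (e.g. one-hot categorical attributes) that must not change.
import torch

def constrained_counterfactual(model, x, target_class, lower, upper,
                               frozen_mask, steps=200, lr=0.05):
    """Search for an x' close to x that `model` assigns to `target_class`,
    clamping each feature into [lower, upper] and keeping frozen features fixed."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss_fn = torch.nn.CrossEntropyLoss()
    target = torch.tensor([target_class])

    for _ in range(steps):
        logits = model(x_adv.unsqueeze(0))
        loss = loss_fn(logits, target)
        loss.backward()

        with torch.no_grad():
            x_adv -= lr * x_adv.grad                      # step toward the target class
            x_adv[frozen_mask] = x[frozen_mask]           # immutable/categorical features stay fixed
            x_adv.copy_(torch.max(torch.min(x_adv, upper), lower))  # enforce range constraints
            done = model(x_adv.unsqueeze(0)).argmax(dim=1).item() == target_class
        x_adv.grad.zero_()
        if done:
            break
    return x_adv.detach()

Under these assumptions, the difference x_adv - x reads directly as the actionable suggestion the abstract mentions: which mutable features would need to change, and by how much, to obtain the other outcome.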
Pages: 43-56
Number of pages: 14
Related papers
50 records in total
  • [31] Adversarial examples for generative models
    Kos, Jernej
    Fischer, Ian
    Song, Dawn
    2018 IEEE SYMPOSIUM ON SECURITY AND PRIVACY WORKSHOPS (SPW 2018), 2018, : 36 - 42
  • [32] Adversarial Examples for Models of Code
    Yefet, Noam
    Alon, Uri
    Yahav, Eran
    PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL, 2020, 4 (04):
  • [33] Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples
    Barredo-Arrieta, Alejandro
    Del Ser, Javier
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [34] Experiments on Adversarial Examples for Deep Learning Model Using Multimodal Sensors
    Kurniawan, Ade
    Ohsita, Yuichi
    Murata, Masayuki
    SENSORS, 2022, 22 (22)
  • [35] ADVERSARIAL-PLAYGROUND: A Visualization Suite Showing How Adversarial Examples Fool Deep Learning
    Norton, Andrew P.
    Qi, Yanjun
    2017 IEEE SYMPOSIUM ON VISUALIZATION FOR CYBER SECURITY (VIZSEC), 2017,
  • [36] Detection of sensors used for adversarial examples against machine learning models
    Kurniawan, Ade
    Ohsita, Yuichi
    Murata, Masayuki
    Results in Engineering, 2024, 24
  • [37] Privacy Risks of Securing Machine Learning Models against Adversarial Examples
    Song, Liwei
    Shokri, Reza
    Mittal, Prateek
    PROCEEDINGS OF THE 2019 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY (CCS'19), 2019, : 241 - 257
  • [38] EXPLAINING DEEP MODELS THROUGH FORGETTABLE LEARNING DYNAMICS
    Benkert, Ryan
    Aribido, Oluwaseun Joseph
    AlRegib, Ghassan
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 3692 - 3696
  • [39] Explaining Deep Learning-Based Driver Models
    Lorente, Maria Paz Sesmero
    Lopez, Elena Magan
    Florez, Laura Alvarez
    Espino, Agapito Ledezma
    Martinez, Jose Antonio Iglesias
    de Miguel, Araceli Sanchis
    APPLIED SCIENCES-BASEL, 2021, 11 (08):
  • [40] Explaining Deep Learning Models for Low Vision Prognosis
    Gui, Haiwen
    Tseng, Benjamin
    Hu, Wendeng
    Wang, Sophia Y.
    INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2022, 63 (07)