Adversarial Counterfactual Visual Explanations

Cited by: 5
Authors
Jeanneret, Guillaume [1]
Simon, Loic [1]
Jurie, Frederic [1]
Affiliations
[1] Univ Caen Normandie, ENSICAEN, CNRS, Caen, France
DOI
10.1109/CVPR52729.2023.01576
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Counterfactual explanations and adversarial attacks have a related goal: flipping output labels with minimal perturbations regardless of their characteristics. Yet, adversarial attacks cannot be used directly in a counterfactual explanation perspective, as such perturbations are perceived as noise and not as actionable and understandable image modifications. Building on the robust learning literature, this paper proposes an elegant method to turn adversarial attacks into semantically meaningful perturbations, without modifying the classifiers to explain. The proposed approach hypothesizes that Denoising Diffusion Probabilistic Models are excellent regularizers for avoiding high-frequency and out-of-distribution perturbations when generating adversarial attacks. The paper's key idea is to build attacks through a diffusion model to polish them. This allows studying the target model regardless of its robustification level. Extensive experimentation shows the advantages of our counterfactual explanation approach over current State-of-the-Art in multiple testbeds.
Pages: 16425-16435
Page count: 11