Adversarial Counterfactual Visual Explanations

Cited by: 5
Authors
Jeanneret, Guillaume [1]
Simon, Loic [1]
Jurie, Frederic [1]
Affiliations
[1] Univ Caen Normandie, ENSICAEN, CNRS, Caen, France
DOI
10.1109/CVPR52729.2023.01576
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Counterfactual explanations and adversarial attacks have a related goal: flipping output labels with minimal perturbations regardless of their characteristics. Yet, adversarial attacks cannot be used directly in a counterfactual explanation perspective, as such perturbations are perceived as noise and not as actionable and understandable image modifications. Building on the robust learning literature, this paper proposes an elegant method to turn adversarial attacks into semantically meaningful perturbations, without modifying the classifiers to explain. The proposed approach hypothesizes that Denoising Diffusion Probabilistic Models are excellent regularizers for avoiding high-frequency and out-of-distribution perturbations when generating adversarial attacks. The paper's key idea is to build attacks through a diffusion model to polish them. This allows studying the target model regardless of its robustification level. Extensive experimentation shows the advantages of our counterfactual explanation approach over current State-of-the-Art in multiple testbeds.
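The key idea in the abstract, computing the attack through a regularizing model so that the resulting perturbation is smooth rather than noise-like, can be illustrated with a toy sketch. This is not the paper's method: a fixed moving-average filter stands in for the diffusion model, the classifier is a random linear model on a 1-D signal, and every name (`low_pass_matrix`, `attack`, `roughness`) is hypothetical and chosen for illustration only.

```python
import numpy as np

def low_pass_matrix(n, width=5):
    """Moving-average filter as an n x n matrix.
    ASSUMPTION: a crude stand-in for the diffusion 'polishing' step."""
    F = np.zeros((n, n))
    half = width // 2
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        F[i, lo:hi] = 1.0 / (hi - lo)
    return F

def attack(x, w, b, F=None, step=0.05, max_iters=500):
    """Gradient ascent on the logit w.(F x') + b until it becomes positive.
    With F=None the attack acts on the raw input (plain adversarial attack)."""
    F = np.eye(len(x)) if F is None else F
    grad = F.T @ w                      # gradient of the logit w.r.t. the input
    grad = grad / np.linalg.norm(grad)  # unit-norm ascent direction
    x_adv = x.astype(float).copy()
    for _ in range(max_iters):
        if w @ (F @ x_adv) + b > 0:     # label flipped: stop
            break
        x_adv += step * grad
    return F @ x_adv                    # the filtered signal is the explanation

def roughness(p):
    """Energy of first differences relative to total energy (higher = noisier)."""
    return np.sum(np.diff(p) ** 2) / np.sum(p ** 2)

rng = np.random.default_rng(0)
n = 64
x = np.sin(np.linspace(0, 2 * np.pi, n))    # smooth toy "image"
w = rng.normal(size=n)                      # toy linear classifier weights
F = low_pass_matrix(n)
b = -1.0 - max(w @ x, w @ (F @ x))          # both variants start with a negative logit

plain = attack(x, w, b) - x                 # perturbation from the plain attack
polished = attack(x, w, b, F) - F @ x       # perturbation from the filtered attack
print("roughness plain:   ", roughness(plain))
print("roughness polished:", roughness(polished))
```

In this toy setting both attacks flip the label, but the perturbation obtained through the filter carries far less high-frequency energy. That mirrors, in a very simplified form, the abstract's hypothesis about diffusion models acting as regularizers.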
Pages: 16425-16435
Number of pages: 11