Interval abstractions for robust counterfactual explanations

Authors
Jiang, Junqi [1]
Leofante, Francesco [1]
Rago, Antonio [1]
Toni, Francesca [1]
Affiliation
[1] Imperial Coll London, Dept Comp, 180 Queens Gate, London SW7 2AZ, England
Funding
European Research Council
Keywords
Explainable AI; Counterfactual explanations; Algorithmic recourse; Robustness of explanations
DOI
10.1016/j.artint.2024.104218
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Counterfactual Explanations (CEs) have emerged as a major paradigm in explainable AI research, providing recourse recommendations for users affected by the decisions of machine learning models. However, CEs found by existing methods often become invalid when slight changes occur in the parameters of the model they were generated for. The literature lacks a way to provide exhaustive robustness guarantees for CEs under model changes, in that existing methods to improve CEs' robustness are mostly heuristic, and the robustness performances are evaluated empirically using only a limited number of retrained models. To bridge this gap, we propose a novel interval abstraction technique for parametric machine learning models, which allows us to obtain provable robustness guarantees for CEs under a possibly infinite set of plausible model changes Delta. Based on this idea, we formalise a robustness notion for CEs, which we call Delta-robustness, in both binary and multi-class classification settings. We present procedures to verify Delta-robustness based on Mixed Integer Linear Programming, using which we further propose algorithms to generate CEs that are Delta-robust. In an extensive empirical study involving neural networks and logistic regression models, we demonstrate the practical applicability of our approach. We discuss two strategies for determining the appropriate hyperparameters in our method, and we quantitatively benchmark CEs generated by eleven methods, highlighting the effectiveness of our algorithms in finding robust CEs.
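The idea of checking a CE against a whole set of model changes can be illustrated on the simplest case mentioned in the abstract, a logistic regression model. The following is a minimal sketch, not the paper's Mixed Integer Linear Programming procedure: it assumes, purely for illustration, that Delta lets every weight and the bias move by at most `delta` in absolute value, in which case interval arithmetic gives the exact worst-case logit. The function names and the parameter `delta` are hypothetical.

```python
from typing import Sequence


def logit_lower_bound(w: Sequence[float], b: float,
                      x: Sequence[float], delta: float) -> float:
    """Worst-case logit of x when each weight and the bias may shift by
    at most `delta` (an assumed form of Delta, chosen for illustration)."""
    nominal = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Each weight interval [wi - delta, wi + delta] contributes at most
    # delta * |xi| of downward shift; the bias contributes at most delta.
    slack = delta * (sum(abs(xi) for xi in x) + 1.0)
    return nominal - slack


def is_delta_robust(w: Sequence[float], b: float,
                    x: Sequence[float], delta: float) -> bool:
    """True if x stays on the positive side (logit >= 0) for every model
    inside the assumed interval around (w, b)."""
    return logit_lower_bound(w, b, x, delta) >= 0.0


# Hypothetical model and counterfactual, for illustration only.
w, b = [1.0, -2.0], 0.5
x_ce = [3.0, 1.0]
print(is_delta_robust(w, b, x_ce, delta=0.1))
print(is_delta_robust(w, b, x_ce, delta=0.5))
```

For a linear model this bound is tight, so no solver is needed; for neural networks the intervals interact across layers, which is where an exact encoding such as the MILP formulation described in the abstract becomes relevant.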
Pages: 25
Related papers
50 in total
  • [31] STEEX: Steering Counterfactual Explanations with Semantics
    Jacob, Paul
    Zablocki, Eloi
    Ben-Younes, Hedi
    Chen, Mickael
    Perez, Patrick
    Cord, Matthieu
    COMPUTER VISION, ECCV 2022, PT XII, 2022, 13672 : 387 - 403
  • [32] Faithful Counterfactual Visual Explanations (FCVE)
    Khan, Bismillah
    Tariq, Syed Ali
    Zia, Tehseen
    Ahsan, Muhammad
    Windridge, David
    KNOWLEDGE-BASED SYSTEMS, 2024, 294
  • [33] CLEAR: Generative Counterfactual Explanations on Graphs
    Ma, Jing
    Guo, Ruocheng
    Mishra, Saumitra
    Zhang, Aidong
    Li, Jundong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [34] Contrastive counterfactual visual explanations with overdetermination
    Adam White
    Kwun Ho Ngan
    James Phelan
    Kevin Ryan
    Saman Sadeghi Afgeh
    Constantino Carlos Reyes-Aldasoro
    Artur d’Avila Garcez
    Machine Learning, 2023, 112 : 3497 - 3525
  • [35] Scaling Guarantees for Nearest Counterfactual Explanations
    Mohammadi, Kiarash
    Karimi, Amir-Hossein
    Barthe, Gilles
    Valera, Isabel
    AIES '21: PROCEEDINGS OF THE 2021 AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY, 2021 : 177 - 187
  • [36] Declarative Approaches to Counterfactual Explanations for Classification
    Bertossi, Leopoldo
    THEORY AND PRACTICE OF LOGIC PROGRAMMING, 2023, 23 (03) : 559 - 593
  • [37] Counterfactual explanations as interventions in latent space
    Crupi, Riccardo
    Castelnovo, Alessandro
    Regoli, Daniele
    San Miguel Gonzalez, Beatriz
    DATA MINING AND KNOWLEDGE DISCOVERY, 2024, 38 (05) : 2733 - 2769
  • [38] Counterfactual Explanations in Explainable AI: A Tutorial
    Wang, Cong
    Li, Xiao-Hui
    Han, Haocheng
    Wang, Shendi
    Wang, Luning
    Cao, Caleb Chen
    Chen, Lei
    KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021 : 4080 - 4081
  • [39] Counterfactual Explanations for Sustainable Tourism Indicators
    Saugar, Javier
    Lancho, Carmen
    Cuesta, Marina
    Cano, Emilio L.
    Martin de Diego, Isaac
    Amado, Antonio
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2024, PT I, 2025, 15346 : 214 - 220
  • [40] Multi-Objective Counterfactual Explanations
    Dandl, Susanne
    Molnar, Christoph
    Binder, Martin
    Bischl, Bernd
    PARALLEL PROBLEM SOLVING FROM NATURE - PPSN XVI, PT I, 2020, 12269 : 448 - 469