DiConStruct: Causal Concept-based Explanations through Black-Box Distillation

Citations: 0
Authors
Moreira, Ricardo [1 ]
Bono, Jacopo [1 ]
Cardoso, Mario [1 ]
Saleiro, Pedro [1 ]
Figueiredo, Mario [2 ]
Bizarro, Pedro [1 ]
Affiliations
[1] Feedzai, Coimbra, Portugal
[2] Universidade de Lisboa, Instituto Superior Técnico, ELLIS Unit Lisbon, Instituto de Telecomunicações, Lisbon, Portugal
DOI: Not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline Codes: 081104; 0812; 0835; 1405
Abstract
Model interpretability plays a central role in human-AI decision-making systems. Ideally, explanations should be expressed using human-interpretable semantic concepts. Moreover, the causal relations between these concepts should be captured by the explainer to allow for reasoning about the explanations. Lastly, explanation methods should be efficient and should not compromise the performance of the predictive task. Despite the recent rapid advances in AI explainability, to the best of our knowledge, no method yet fulfills these three desiderata. Indeed, mainstream methods for local concept explainability do not yield causal explanations and incur a trade-off between explainability and prediction accuracy. We present DiConStruct, an explanation method that is both concept-based and causal, producing more interpretable local explanations in the form of structural causal models and concept attributions. Our explainer works as a distillation model for any black-box machine learning model, approximating its predictions while producing the respective explanations. Consequently, DiConStruct generates explanations efficiently without impacting the black-box prediction task. We validate our method on an image dataset and a tabular dataset, showing that DiConStruct approximates the black-box models with higher fidelity than other concept-explainability baselines, while providing explanations that include the causal relations between the concepts.
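To make the architecture described in the abstract concrete, the sketch below illustrates, in PyTorch, one way such a distillation surrogate could be structured: a fixed concept DAG with one structural equation and one exogenous (noise) term per concept, and a final node that regresses the black-box output from the concepts, trained jointly on a distillation (fidelity) loss and a concept-supervision loss. This is an assumption-laden sketch based only on the abstract, not the authors' implementation; the ConceptSCM class, the toy smoke/fire/alarm DAG, the stand-in black box, and the unweighted loss sum are all hypothetical.

```python
# Minimal, illustrative sketch of the idea in the abstract: a distillation
# surrogate that models semantic concepts with a fixed causal graph (an SCM)
# and regresses the black-box output from them. Not the authors' code;
# all names, the toy DAG, and the losses are assumptions for illustration.
import torch
import torch.nn as nn

class ConceptSCM(nn.Module):
    """Surrogate explainer: exogenous encoder + structural equations over concepts."""
    def __init__(self, in_dim, concepts, parents):
        super().__init__()
        self.concepts = concepts  # concept names in topological order
        self.parents = parents    # concept -> list of parent concepts
        # One exogenous (noise) variable per concept, encoded from the input.
        self.exo = nn.ModuleDict({c: nn.Linear(in_dim, 1) for c in concepts})
        # One structural equation per concept: f_c(parents(c), u_c).
        self.eq = nn.ModuleDict(
            {c: nn.Linear(len(parents[c]) + 1, 1) for c in concepts}
        )
        # Final node: black-box score as a function of all concepts.
        self.head = nn.Linear(len(concepts), 1)

    def forward(self, x):
        vals = {}
        for c in self.concepts:   # traverse the DAG in topological order
            u = self.exo[c](x)    # exogenous term for concept c
            pa = [vals[p] for p in self.parents[c]]
            vals[c] = torch.sigmoid(self.eq[c](torch.cat(pa + [u], dim=1)))
        concept_vec = torch.cat([vals[c] for c in self.concepts], dim=1)
        return torch.sigmoid(self.head(concept_vec)), concept_vec

# Toy usage: distill a frozen black box while supervising the concepts.
if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn(256, 10)
    black_box = lambda t: torch.sigmoid(t.sum(dim=1, keepdim=True))  # stand-in
    y_bb = black_box(x).detach()              # black-box scores to imitate
    c_labels = (x[:, :3] > 0).float()         # stand-in concept annotations
    dag = {"smoke": [], "fire": ["smoke"], "alarm": ["smoke", "fire"]}
    model = ConceptSCM(10, list(dag), dag)
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(200):
        y_hat, c_hat = model(x)
        # Distillation (fidelity) loss + concept-supervision loss.
        loss = nn.functional.binary_cross_entropy(y_hat, y_bb) \
             + nn.functional.binary_cross_entropy(c_hat, c_labels)
        opt.zero_grad(); loss.backward(); opt.step()
    fid = nn.functional.binary_cross_entropy(y_hat, y_bb).item()
    print(f"fidelity BCE: {fid:.3f}")
```

Note that only the surrogate is trained: the black box stays frozen, which is consistent with the abstract's claim that producing explanations does not impact the black-box prediction task.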
Pages: 740 - 768
Number of pages: 29
Related papers (50 in total)
  • [41] Overlooked Factors in Concept-based Explanations: Dataset Choice, Concept Learnability, and Human Capability
    Ramaswamy, Vikram V.
    Kim, Sunnie S. Y.
    Fong, Ruth
    Russakovsky, Olga
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 10932 - 10941
  • [42] Validating Automatic Concept-Based Explanations for AI-Based Digital Histopathology
    Sauter, Daniel
    Lodde, Georg
    Nensa, Felix
    Schadendorf, Dirk
    Livingstone, Elisabeth
    Kukuk, Markus
    SENSORS, 2022, 22 (14)
  • [43] Promoting robust black-box solvers through competitions
    Lecoutre, Christophe
    Roussel, Olivier
    van Dongen, M. R. C.
    CONSTRAINTS, 2010, 15 (03) : 317 - 326
  • [44] On Completeness-aware Concept-Based Explanations in Deep Neural Networks
    Yeh, Chih-Kuan
    Kim, Been
    Arik, Sercan O.
    Li, Chun-Liang
    Pfister, Tomas
    Ravikumar, Pradeep
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [46] Corpus-level and Concept-based Explanations for Interpretable Document Classification
    Shi, Tian
    Zhang, Xuchao
    Wang, Ping
    Reddy, Chandan K.
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2022, 16 (03)
  • [47] Ranking-Based Black-Box Complexity
    Doerr, Benjamin
    Winzen, Carola
    ALGORITHMICA, 2014, 68 (03) : 571 - 609
  • [49] A MODEL DISTILLATION APPROACH FOR EXPLAINING BLACK-BOX MODELS FOR HYPERSPECTRAL IMAGE CLASSIFICATION
    Taskin, Gulsen
    2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022, : 3592 - 3595
  • [50] CERTIFAI: A Common Framework to Provide Explanations and Analyse the Fairness and Robustness of Black-box Models
    Sharma, Shubham
    Henderson, Jette
    Ghosh, Joydeep
    PROCEEDINGS OF THE 3RD AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY AIES 2020, 2020, : 166 - 172