Learning to Explain: A Model-Agnostic Framework for Explaining Black Box Models

Cited by: 2

Authors:

Barkan, Oren [1]

Asher, Yuval [2]

Eshel, Amit [2]

Elisha, Yehonatan [1]

Koenigstein, Noam [2]

Affiliations:

[1] Open Univ, Milton Keynes, England

[2] Tel Aviv Univ, Tel Aviv, Israel

Source:

23RD IEEE INTERNATIONAL CONFERENCE ON DATA MINING, ICDM 2023 | 2023

Funding:

Israel Science Foundation;

Keywords:

Explainable AI; computer vision; transformers;

DOI:

10.1109/ICDM58522.2023.00105

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present Learning to Explain (LTX), a model-agnostic framework designed for providing post-hoc explanations for vision models. The LTX framework introduces an "explainer" model that generates explanation maps, highlighting the crucial regions that justify the predictions made by the model being explained. To train the explainer, we employ a two-stage process consisting of initial pretraining followed by per-instance finetuning. During both stages of training, we utilize a unique configuration where we compare the explained model's prediction for a masked input with its original prediction for the unmasked input. This approach enables the use of a novel counterfactual objective, which aims to anticipate the model's output using masked versions of the input image. Importantly, the LTX framework is not restricted to a specific model architecture and can provide explanations for both Transformer-based and convolutional models. Through our evaluations, we demonstrate that LTX significantly outperforms the current state-of-the-art in explainability across various metrics. Our code is available at: https://githab.cian/LTX-CodelLTX
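
The comparison the abstract describes, between the explained model's prediction on a masked input and its prediction on the unmasked input, can be illustrated with a small sketch. This is not the authors' implementation or the LTX objective: the linear softmax `toy_model`, the helper `masked_prediction_loss`, the cross-entropy fidelity term, and the `sparsity_weight` penalty are all assumptions chosen only to make the idea concrete.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D logit vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def toy_model(x, W):
    # Hypothetical stand-in for the explained model: a linear softmax classifier.
    return softmax(W @ x.ravel())

def masked_prediction_loss(x, mask, W, sparsity_weight=0.1):
    # Assumed illustrative loss, not the paper's objective:
    # 1) fidelity: the masked input should still yield the class the model
    #    predicted for the unmasked input (cross-entropy on that class);
    # 2) sparsity: the mask should keep as few pixels as possible.
    p_orig = toy_model(x, W)
    target = int(np.argmax(p_orig))           # original prediction
    p_masked = toy_model(x * mask, W)         # prediction on the masked input
    fidelity = -np.log(p_masked[target] + 1e-12)
    sparsity = sparsity_weight * mask.mean()
    return fidelity + sparsity, target

rng = np.random.default_rng(0)
x = rng.random((4, 4))                        # toy 4x4 "image"
W = rng.normal(size=(3, 16))                  # toy weights for 3 classes

# Keeping every pixel reproduces the original prediction but pays full sparsity cost.
loss_full, target = masked_prediction_loss(x, np.ones_like(x), W)
# Removing every pixel leaves the toy model with a uniform prediction.
loss_empty, _ = masked_prediction_loss(x, np.zeros_like(x), W)
```

In a setup like this, an explainer network would be trained to output `mask` so that the loss is low, which is one plausible reading of "anticipating the model's output using masked versions of the input"; the paper itself should be consulted for the actual objective and training stages.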

Pages: 944-949

Page count: 6
