Explain the Explainer: Interpreting Model-Agnostic Counterfactual Explanations of a Deep Reinforcement Learning Agent

Cited by: 6

Authors:

Chen Z. [1]

Silvestri F. [2]

Tolomei G. [2]

Wang J. [3]

Zhu H. [4]

Ahn H. [1]

Affiliations:

[1] Stony Brook University, Department of Applied Mathematics and Statistics, Stony Brook, 11794, NY

[2] Sapienza University of Rome, Department of Computer Engineering, The Department of Computer Science, Rome

[3] Xi'an Jiaotong-Liverpool University, Department of Intelligent Science, Suzhou

[4] Rutgers University-New Brunswick, Department of Computer Science, Piscataway, 08854, NJ

Source:

IEEE Transactions on Artificial Intelligence | 2024, Vol. 5, No. 4

Keywords:

Counterfactual explanations; deep reinforcement learning (DRL); explainable artificial intelligence (XAI); machine learning (ML) explainability

DOI:

10.1109/TAI.2022.3223892

Abstract:

Counterfactual examples (CFs) are one of the most popular methods for attaching post hoc explanations to machine learning models. However, existing CF generation methods either exploit the internals of specific models or depend on each sample's neighborhood; thus, they are hard to generalize for complex models and inefficient for large datasets. This article aims to overcome these limitations and introduces ReLAX, a model-agnostic algorithm to generate optimal counterfactual explanations. Specifically, we formulate the problem of crafting CFs as a sequential decision-making task. We then find the optimal CFs via deep reinforcement learning (DRL) with discrete-continuous hybrid action space. In addition, we develop a distillation algorithm to extract decision rules from the DRL agent's policy in the form of a decision tree to make the process of generating CFs itself interpretable. Extensive experiments conducted on six tabular datasets have shown that ReLAX outperforms existing CF generation baselines, as it produces sparser counterfactuals, is more scalable to complex target models to explain, and generalizes to both the classification and regression tasks. Finally, we show the ability of our method to provide actionable recommendations and distill interpretable policy explanations in two practical real-world use cases. © 2020 IEEE.

Pages: 1443-1457

Page count: 14
