Explain the Explainer: Interpreting Model-Agnostic Counterfactual Explanations of a Deep Reinforcement Learning Agent

Cited by: 4
Authors
Chen Z. [1 ]
Silvestri F. [2 ]
Tolomei G. [2 ]
Wang J. [3 ]
Zhu H. [4 ]
Ahn H. [1 ]
Affiliations
[1] Stony Brook University, Department of Applied Mathematics and Statistics, Stony Brook, 11794, NY
[2] Sapienza University of Rome, Department of Computer Engineering and Department of Computer Science, Rome
[3] Xi'an Jiaotong-Liverpool University, Department of Intelligent Science, Suzhou
[4] Rutgers University-New Brunswick, Department of Computer Science, Piscataway, 08854, NJ
Source
IEEE Transactions on Artificial Intelligence
Keywords
Counterfactual explanations; deep reinforcement learning (DRL); explainable artificial intelligence (XAI); machine learning (ML) explainability;
DOI
10.1109/TAI.2022.3223892
Abstract
Counterfactual examples (CFs) are one of the most popular methods for attaching post hoc explanations to machine learning models. However, existing CF generation methods either exploit the internals of specific models or depend on each sample's neighborhood; thus, they are hard to generalize to complex models and inefficient on large datasets. This article aims to overcome these limitations and introduces ReLAX, a model-agnostic algorithm for generating optimal counterfactual explanations. Specifically, we formulate the problem of crafting CFs as a sequential decision-making task and find the optimal CFs via deep reinforcement learning (DRL) with a discrete-continuous hybrid action space. In addition, we develop a distillation algorithm that extracts decision rules from the DRL agent's policy in the form of a decision tree, making the process of generating CFs itself interpretable. Extensive experiments on six tabular datasets show that ReLAX outperforms existing CF generation baselines: it produces sparser counterfactuals, scales better to complex target models, and generalizes to both classification and regression tasks. Finally, we demonstrate the ability of our method to provide actionable recommendations and to distill interpretable policy explanations in two practical real-world use cases. © 2020 IEEE.
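The core idea stated in the abstract (CF search as a sequential decision process with a hybrid discrete-continuous action, followed by distilling the policy into a decision tree) can be illustrated with a minimal sketch. The snippet below is not the authors' ReLAX implementation: the policy is a random stand-in for a trained DRL agent, and the names `random_policy`, `generate_cf`, and `surrogate` are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# A toy black-box classifier standing in for the target model to explain.
X = rng.normal(size=(500, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
black_box = LogisticRegression().fit(X, y)

def random_policy(state):
    """Stand-in for the learned DRL policy: a hybrid action is
    (discrete feature index to edit, continuous perturbation size)."""
    feature = int(rng.integers(state.shape[0]))   # discrete component
    delta = float(rng.normal(scale=0.5))          # continuous component
    return feature, delta

def generate_cf(x, policy, target_class, max_steps=20):
    """Apply the policy step by step until the model's prediction flips
    (or the step budget runs out); one feature changes per step, which is
    what keeps the resulting counterfactual sparse."""
    cf = x.copy()
    for _ in range(max_steps):
        if black_box.predict(cf.reshape(1, -1))[0] == target_class:
            break
        feature, delta = policy(cf)
        cf[feature] += delta
    return cf

# Distillation step: fit a shallow decision tree that imitates the policy's
# discrete choice, so the CF-generation process itself becomes interpretable.
states = rng.normal(size=(200, 4))
chosen_features = np.array([random_policy(s)[0] for s in states])
surrogate = DecisionTreeClassifier(max_depth=3).fit(states, chosen_features)

x0 = X[0]
original = black_box.predict(x0.reshape(1, -1))[0]
cf = generate_cf(x0, random_policy, target_class=1 - original)
print("original class:", original,
      "-> counterfactual class:", black_box.predict(cf.reshape(1, -1))[0])
```

In the paper's setting the random policy above would be replaced by an agent trained with a reward that encourages flipping the prediction with as few and as small feature edits as possible; the decision-tree surrogate is what makes that learned perturbation strategy human-readable.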
Pages: 1443-1457
Page count: 14
Related Papers
50 records in total
  • [1] Learning Model-Agnostic Counterfactual Explanations for Tabular Data
    Pawelczyk, Martin
    Broelemann, Klaus
    Kasneci, Gjergji
    WEB CONFERENCE 2020: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2020), 2020, : 3126 - 3132
  • [2] Model-Agnostic Counterfactual Explanations in Credit Scoring
    Dastile, Xolani
    Celik, Turgay
    Vandierendonck, Hans
    IEEE ACCESS, 2022, 10 : 69543 - 69554
  • [3] Model-Agnostic Counterfactual Explanations for Consequential Decisions
    Karimi, Amir-Hossein
    Barthe, Gilles
    Balle, Borja
    Valera, Isabel
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 895 - 904
  • [4] RelEx: A Model-Agnostic Relational Model Explainer
    Zhang, Yue
    Defazio, David
    Ramesh, Arti
    AIES '21: PROCEEDINGS OF THE 2021 AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY, 2021, : 1042 - 1049
  • [5] MANE: Model-Agnostic Non-linear Explanations for Deep Learning Model
    Tian, Yue
    Liu, Guanjun
    2020 IEEE WORLD CONGRESS ON SERVICES (SERVICES), 2020, : 33 - 36
  • [6] Plug-and-Play Model-Agnostic Counterfactual Policy Synthesis for Deep Reinforcement Learning-Based Recommendation
    Wang, Siyu
    Chen, Xiaocong
    McAuley, Julian
    Cripps, Sally
    Yao, Lina
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, : 1 - 12
  • [7] Individualized help for at-risk students using model-agnostic and counterfactual explanations
    Smith, Bevan I.
    Chimedza, Charles
    Buhrmann, Jacoba H.
    EDUCATION AND INFORMATION TECHNOLOGIES, 2022, 27 (02) : 1539 - 1558
  • [8] CountARFactuals - Generating Plausible Model-Agnostic Counterfactual Explanations with Adversarial Random Forests
    Dandl, Susanne
    Blesch, Kristin
    Freiesleben, Timo
    Koenig, Gunnar
    Kapar, Jan
    Bischl, Bernd
    Wright, Marvin N.
    EXPLAINABLE ARTIFICIAL INTELLIGENCE, PT III, XAI 2024, 2024, 2155 : 85 - 107
  • [9] Multi-Agent Chronological Planning with Model-Agnostic Meta Reinforcement Learning
    Hu, Cong
    Xu, Kai
    Zhu, Zhengqiu
    Qin, Long
    Yin, Quanjun
    APPLIED SCIENCES-BASEL, 2023, 13 (16):