On generating trustworthy counterfactual explanations

Cited by: 30
Authors
Del Ser, Javier [1 ,2 ]
Barredo-Arrieta, Alejandro [3 ]
Diaz-Rodriguez, Natalia [4 ]
Herrera, Francisco [4 ]
Saranti, Anna [5 ,6 ]
Holzinger, Andreas [5 ,6 ,7 ,8 ]
Affiliations
[1] TECNALIA, Basque Res & Technol Alliance BRTA, P Tecnol Ed 700, Derio 48160, Spain
[2] Univ Basque Country, UPV EHU, Dept Commun Engn, E-48013 Bilbao, Spain
[3] Kurago Software, Bilbao 48011, Spain
[4] Univ Granada, DaSCI Andalusian Inst Data Sci & Computat Intellig, Data Sci & Computat Intelligence, Granada 18071, Spain
[5] Univ Nat Resources & Life Sci Vienna, Human Ctr AI Lab, A-1190 Vienna, Austria
[6] Med Univ Graz, A-8036 Graz, Austria
[7] Alberta Machine Intelligence Inst, xAI Lab, Edmonton, AB, Canada
[8] Univ Nat Resources & Life Sci Vienna, Peter Jordan Str 82, A-1190 Vienna, Austria
Funding
Austrian Science Fund;
Keywords
Explainable artificial intelligence; Deep learning; Counterfactual explanations; Generative adversarial networks; Multi-objective optimization; CLASSIFICATION;
DOI
10.1016/j.ins.2023.119898
Chinese Library Classification (CLC)
TP [Automation & Computer Technology];
Discipline Classification Code
0812;
Abstract
Deep learning models like ChatGPT exemplify AI success but necessitate a deeper understanding of trust in critical sectors. Trust can be achieved using counterfactual explanations, which is how humans become familiar with unknown processes: by understanding the hypothetical input circumstances under which the output changes. We argue that the generation of counterfactual explanations requires considering several aspects of the generated counterfactual instances, not just their counterfactual ability. We present a framework for generating counterfactual explanations that formulates its goal as a multi-objective optimization problem balancing three objectives: plausibility, the intensity of changes, and adversarial power. We use a generative adversarial network to model the distribution of the input, along with a multi-objective counterfactual discovery solver balancing these objectives. We demonstrate the usefulness of the framework on six classification tasks with image and 3D data, confirming with evidence the existence of a trade-off between the objectives, the consistency of the produced counterfactual explanations with human knowledge, and the capability of the framework to unveil the existence of concept-based biases and misrepresented attributes in the input domain of the audited model. Our pioneering effort shall inspire further work on the generation of plausible counterfactual explanations in real-world scenarios where attribute-/concept-based annotations are available for the domain under analysis.
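The multi-objective formulation described in the abstract can be illustrated with a minimal, hypothetical sketch: candidate counterfactuals are scored on three objectives (plausibility, intensity of changes, adversarial power) and only Pareto-optimal candidates survive. The objective functions below are toy stand-ins, not the paper's actual GAN-based measures:

```python
# Sketch of a three-objective Pareto filter for counterfactual candidates.
# All objectives are formulated so that larger values are better.

def dominates(a, b):
    """True if objective vector a Pareto-dominates b."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(candidates, objectives):
    """Keep candidates whose objective vectors no other candidate dominates."""
    scored = [(c, tuple(f(c) for f in objectives)) for c in candidates]
    return [c for c, s in scored
            if not any(dominates(s2, s) for _, s2 in scored if s2 != s)]

# Toy example: candidates are 2-D perturbation vectors.
cands = [(0.1, 0.9), (0.5, 0.5), (0.9, 0.1), (0.2, 0.2)]
objs = [
    lambda c: -abs(c[0] - c[1]),      # stand-in for plausibility
    lambda c: -(c[0]**2 + c[1]**2),   # stand-in for low intensity of changes
    lambda c: c[0] + c[1],            # stand-in for adversarial power
]
front = pareto_front(cands, objs)     # the surviving trade-off candidates
```

In the paper's actual framework, the plausibility term would come from the GAN's learned input distribution and the search would be run by a dedicated multi-objective solver; this sketch only shows the Pareto-dominance logic behind the trade-off the authors report.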
Pages: 24
Related Papers
50 records total
  • [1] Generating Robust Counterfactual Explanations
    Guyomard, Victor
    Fessant, Francoise
    Guyet, Thomas
    Bouadi, Tassadit
    Termier, Alexandre
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, ECML PKDD 2023, PT III, 2023, 14171 : 394 - 409
  • [2] Generating Natural Counterfactual Visual Explanations
    Zhao, Wenqi
    Oyama, Satoshi
    Kurihara, Masahito
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 5204 - 5205
  • [3] DeltaExplainer: A Software Debugging Approach to Generating Counterfactual Explanations
    Shree, Sunny
    Chandrasekaran, Jaganmohan
    Lei, Yu
    Kacker, Raghu N.
    Kuhn, D. Richard
    2022 FOURTH IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE TESTING (AITEST 2022), 2022, : 103 - 110
  • [4] Generating Sparse Counterfactual Explanations for Multivariate Time Series
    Lang, Jana
    Giese, Martin A.
    Ilg, Winfried
    Otte, Sebastian
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VI, 2023, 14259 : 180 - 193
  • [5] Ask to Know More: Generating Counterfactual Explanations for Fake Claims
    Dai, Shih-Chieh
    Hsu, Yi-Li
    Xiong, Aiping
    Ku, Lun-Wei
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 2800 - 2810
  • [6] A General Search-Based Framework for Generating Textual Counterfactual Explanations
    Gilo, Daniel
    Markovitch, Shaul
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 18073 - 18081
  • [7] A Few Good Counterfactuals: Generating Interpretable, Plausible and Diverse Counterfactual Explanations
    Smyth, Barry
    Keane, Mark T.
    CASE-BASED REASONING RESEARCH AND DEVELOPMENT, ICCBR 2022, 2022, 13405 : 18 - 32
  • [8] Generating Counterfactual Explanations For Causal Inference in Breast Cancer Treatment Response
    Zhou, Siqiong
    Pfeiffer, Nicholaus
    Islam, Upala J.
    Banerjee, Imon
    Patel, Bhavika K.
    Iquebal, Ashif S.
    2022 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATION SCIENCE AND ENGINEERING (CASE), 2022, : 955 - 960
  • [9] CoCoX: Generating Conceptual and Counterfactual Explanations via Fault-Lines
    Akula, Arjun R.
    Wang, Shuai
    Zhu, Song-Chun
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 2594 - 2601
  • [10] Generating Interpretable Counterfactual Explanations By Implicit Minimisation of Epistemic and Aleatoric Uncertainties
    Schut, Lisa
    Key, Oscar
    McGrath, Rory
    Costabello, Luca
    Sacaleanu, Bogdan
    Corcoran, Medb
    Gal, Yarin
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130