When are Post-hoc Conceptual Explanations Identifiable?

Cited by: 0
Authors
Leemann, Tobias [1 ,2 ]
Kirchhof, Michael [1 ]
Rong, Yao [1 ,2 ]
Kasneci, Enkelejda [2 ]
Kasneci, Gjergji [2 ]
Affiliations
[1] Univ Tubingen, Tubingen, Germany
[2] Tech Univ Munich, Munich, Germany
Source
Keywords
INDEPENDENT COMPONENT ANALYSIS; NONLINEAR ICA;
DOI
not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Interest in understanding and factorizing learned embedding spaces through conceptual explanations is steadily growing. When no human concept labels are available, concept discovery methods search trained embedding spaces for interpretable concepts like object shape or color that can provide post-hoc explanations for decisions. Unlike previous work, we argue that concept discovery should be identifiable, meaning that a number of known concepts can be provably recovered to guarantee reliability of the explanations. As a starting point, we explicitly make the connection between concept discovery and classical methods like Principal Component Analysis and Independent Component Analysis by showing that they can recover independent concepts under non-Gaussian distributions. For dependent concepts, we propose two novel approaches that exploit functional compositionality properties of image-generating processes. Our provably identifiable concept discovery methods substantially outperform competitors on a battery of experiments including hundreds of trained models and dependent concepts, where they exhibit up to 29% better alignment with the ground truth. Our results highlight the strict conditions under which reliable concept discovery without human labels can be guaranteed and provide a formal foundation for the domain. Our code is available online.
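The abstract's claim that classical Independent Component Analysis can recover independent concepts under non-Gaussian distributions can be illustrated with a minimal sketch (this is not the paper's code): two independent uniform "concept" sources are linearly mixed into an embedding, and scikit-learn's FastICA recovers them up to permutation and sign. The mixing matrix and source distributions here are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n = 5000

# Two independent non-Gaussian "concepts" (uniform is sub-Gaussian,
# so the non-Gaussianity condition from the abstract is satisfied).
S = rng.uniform(-1.0, 1.0, size=(n, 2))

# Unknown linear mixing into an observed "embedding" space.
A = np.array([[1.0, 0.5],
              [0.4, 1.0]])
X = S @ A.T

# FastICA recovers the sources without access to S or A.
ica = FastICA(n_components=2, whiten="unit-variance", random_state=0)
S_hat = ica.fit_transform(X)

# Recovery is only identifiable up to permutation and sign, so we
# check that each true concept is highly correlated (in absolute
# value) with some recovered component.
corr = np.abs(np.corrcoef(S.T, S_hat.T)[:2, 2:])
print(corr.max(axis=1))  # close to 1.0 for both concepts
```

Had the sources been Gaussian, any rotation of the mixed space would be an equally valid solution, which is exactly why the abstract restricts the identifiability guarantee to non-Gaussian distributions.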
Pages: 1207-1218
Page count: 12
Related Papers (50 records)
  • [1] Generating Recommendations with Post-Hoc Explanations for Citizen Science
    Ben Zaken, Daniel
    Shani, Guy
    Segal, Avi
    Cavalier, Darlene
    Gal, Kobi
    PROCEEDINGS OF THE 30TH ACM CONFERENCE ON USER MODELING, ADAPTATION AND PERSONALIZATION, UMAP 2022, 2022, : 69 - 78
  • [2] The Dangers of Post-hoc Interpretability: Unjustified Counterfactual Explanations
    Laugel, Thibault
    Lesot, Marie-Jeanne
    Marsala, Christophe
    Renard, Xavier
    Detyniecki, Marcin
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 2801 - 2807
  • [3] An Empirical Comparison of Interpretable Models to Post-Hoc Explanations
    Mahya, Parisa
    Fuernkranz, Johannes
    AI, 2023, 4 (02) : 426 - 436
  • [4] Evaluating Stability of Post-hoc Explanations for Business Process Predictions
    Velmurugan, Mythreyi
    Ouyang, Chun
    Moreira, Catarina
    Sindhgatta, Renuka
    SERVICE-ORIENTED COMPUTING (ICSOC 2021), 2021, 13121 : 49 - 64
  • [5] Comparing Strategies for Post-Hoc Explanations in Machine Learning Models
    Vij, Aabhas
    Nanjundan, Preethi
    MOBILE COMPUTING AND SUSTAINABLE INFORMATICS, 2022, 68 : 585 - 592
  • [6] A Study on Trust in Black Box Models and Post-hoc Explanations
    El Bekri, Nadia
    Kling, Jasmin
    Huber, Marco F.
    14TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING MODELS IN INDUSTRIAL AND ENVIRONMENTAL APPLICATIONS (SOCO 2019), 2020, 950 : 35 - 46
  • [7] Preference-based and local post-hoc explanations for recommender systems
    Brunot, Leo
    Canovas, Nicolas
    Chanson, Alexandre
    Labroche, Nicolas
    Verdeaux, Willeme
    INFORMATION SYSTEMS, 2022, 108
  • [8] Post-hoc Rule Based Explanations for Black Box Bayesian Optimization
    Chakraborty, Tanmay
    Wirth, Christian
    Seifert, Christin
    ARTIFICIAL INTELLIGENCE-ECAI 2023 INTERNATIONAL WORKSHOPS, PT 1, XAI3, TACTIFUL, XI-ML, SEDAMI, RAAIT, AI4S, HYDRA, AI4AI, 2023, 2024, 1947 : 320 - 337
  • [9] Post-hoc vs ante-hoc explanations: xAI design guidelines for data scientists
    Retzlaff, Carl O.
    Angerschmid, Alessa
    Saranti, Anna
    Schneeberger, David
    Roettger, Richard
    Mueller, Heimo
    Holzinger, Andreas
    COGNITIVE SYSTEMS RESEARCH, 2024, 86
  • [10] Can Post-hoc Explanations Effectively Detect Out-of-Distribution Samples?
    Martinez-Seras, Aitor
    Del Ser, Javier
    Garcia-Bringas, Pablo
    2022 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2022