When are Post-hoc Conceptual Explanations Identifiable?

Cited by: 0
Authors
Leemann, Tobias [1, 2]
Kirchhof, Michael [1]
Rong, Yao [1, 2]
Kasneci, Enkelejda [2]
Kasneci, Gjergji [2]
Affiliations
[1] Univ Tubingen, Tubingen, Germany
[2] Tech Univ Munich, Munich, Germany
Source
Keywords
INDEPENDENT COMPONENT ANALYSIS; NONLINEAR ICA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Interest in understanding and factorizing learned embedding spaces through conceptual explanations is steadily growing. When no human concept labels are available, concept discovery methods search trained embedding spaces for interpretable concepts like object shape or color that can provide post-hoc explanations for decisions. Unlike previous work, we argue that concept discovery should be identifiable, meaning that a number of known concepts can be provably recovered to guarantee reliability of the explanations. As a starting point, we explicitly make the connection between concept discovery and classical methods like Principal Component Analysis and Independent Component Analysis by showing that they can recover independent concepts under non-Gaussian distributions. For dependent concepts, we propose two novel approaches that exploit functional compositionality properties of image-generating processes. Our provably identifiable concept discovery methods substantially outperform competitors on a battery of experiments including hundreds of trained models and dependent concepts, where they exhibit up to 29% better alignment with the ground truth. Our results highlight the strict conditions under which reliable concept discovery without human labels can be guaranteed and provide a formal foundation for the domain. Our code is available online.
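The abstract's claim that Independent Component Analysis can recover independent, non-Gaussian concepts can be illustrated with a minimal sketch. This is not the authors' code: it assumes the embeddings are a linear mixture of the concepts, uses synthetic Laplace-distributed concepts, and applies a generic textbook FastICA. The names `whiten`, `fast_ica`, and `alignment` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def whiten(x):
    """Center x (n_samples, d) and transform it to identity covariance."""
    x = x - x.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(x, rowvar=False))
    return x @ eigvec / np.sqrt(eigval)


def fast_ica(x, n_iter=200, seed=0):
    """Generic symmetric FastICA with a tanh nonlinearity (illustrative)."""
    z = whiten(x)
    n, d = z.shape
    rng = np.random.default_rng(seed)
    # Random orthogonal starting point; rows of w are unmixing directions.
    w = np.linalg.qr(rng.normal(size=(d, d)))[0]
    for _ in range(n_iter):
        y = z @ w.T                      # current component estimates
        g = np.tanh(y)
        g_prime = 1.0 - g ** 2
        # Fixed-point update: E[z g(w^T z)] - E[g'(w^T z)] w, for every row.
        w_new = g.T @ z / n - np.diag(g_prime.mean(axis=0)) @ w
        # Symmetric decorrelation keeps the rows orthonormal.
        u, _, vt = np.linalg.svd(w_new)
        w = u @ vt
    return z @ w.T


def alignment(estimated, true):
    """Mean over true concepts of the best absolute correlation."""
    d = true.shape[1]
    corr = np.abs(np.corrcoef(estimated, true, rowvar=False)[:d, d:])
    return corr.max(axis=0).mean()


# Synthetic setup (an assumption): 3 independent non-Gaussian concepts,
# linearly mixed into a 3-dimensional "embedding space".
rng = np.random.default_rng(0)
concepts = rng.laplace(size=(5000, 3))
mixing = rng.normal(size=(3, 3))
embeddings = concepts @ mixing.T

recovered = fast_ica(embeddings)
score = alignment(recovered, concepts)
print(f"alignment with ground-truth concepts: {score:.3f}")
```

The recovered components match the true concepts only up to permutation, sign, and scale, which is why the score uses absolute correlations and a best match per concept. With Gaussian concepts this recovery would not be expected to work, since any rotation of whitened Gaussian data is equally valid.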
Pages: 1207-1218
Number of pages: 12