Contrastive Explanations for Model Interpretability

Cited by: 0
Authors
Jacovi, Alon [1]
Swayamdipta, Swabha [2]
Ravfogel, Shauli [1, 2]
Elazar, Yanai [1, 2]
Choi, Yejin [2, 3]
Goldberg, Yoav [1, 2]
Affiliations
[1] Bar Ilan Univ, Ramat Gan, Israel
[2] Allen Inst Artificial Intelligence, Seattle, WA USA
[3] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA 98195 USA
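Funding
European Research Council
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract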
Contrastive explanations clarify why an event occurred in contrast to another. They are inherently intuitive for humans both to produce and to comprehend. We propose a method to produce contrastive explanations in the latent space, via a projection of the input representation, such that only the features that differentiate two potential decisions are captured. Our modification allows model behavior to consider only contrastive reasoning, and uncovers which aspects of the input are useful for and against particular decisions. Additionally, for a given input feature, our contrastive explanations can answer for which label, and against which alternative label, the feature is useful. We produce contrastive explanations via both high-level abstract concept attribution and low-level input token/span attribution for two NLP classification benchmarks. Our findings demonstrate the ability of label-contrastive explanations to provide fine-grained interpretability of model decisions.
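The idea of projecting a representation so that only the features separating two decisions remain can be illustrated with a minimal sketch. It assumes a linear final classification layer and is not taken from the paper: the function name `contrastive_projection`, the toy vectors, and the choice of projecting onto the difference of the two class weight vectors are illustrative assumptions. Under those assumptions, the projected representation keeps the logit gap between the two labels and discards every component orthogonal to the direction that distinguishes them.

```python
# Illustrative sketch only: assumes a linear classifier whose logit for a
# label is the dot product of that label's weight vector with the hidden
# representation h. All names and numbers here are hypothetical.

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def contrastive_projection(h, w_fact, w_foil):
    """Project h onto the direction w_fact - w_foil.

    The result keeps only the component of h that changes the logit
    difference between the two labels.
    """
    u = [p - q for p, q in zip(w_fact, w_foil)]
    norm_sq = dot(u, u)
    if norm_sq == 0.0:
        raise ValueError("the two labels have identical weight vectors")
    scale = dot(h, u) / norm_sq
    return [scale * x for x in u]


# Toy hidden representation and class weight vectors (made-up values).
h = [2.0, -1.0, 0.5]
w_fact = [1.0, 0.0, 1.0]   # weights of the predicted label
w_foil = [0.0, 1.0, 1.0]   # weights of the alternative label

h_c = contrastive_projection(h, w_fact, w_foil)

# The logit gap between the two labels is the same before and after projection.
gap_full = dot(w_fact, h) - dot(w_foil, h)
gap_proj = dot(w_fact, h_c) - dot(w_foil, h_c)
print(h_c, gap_full, gap_proj)
```

In this toy case the third coordinate, which both labels weight equally, is zeroed out by the projection, since it cannot help choose one label over the other.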
Pages: 1597-1611
Page count: 15