Explaining by Removing: A Unified Framework for Model Explanation

Authors
Covert, Ian C. [1]
Lundberg, Scott [2]
Lee, Su-In [1]
Affiliations
[1] Univ Washington, Paul G Allen Sch Comp Sci Engn, Seattle, WA 98195 USA
[2] Microsoft Corp, Microsoft Res, Redmond, WA 98052 USA
Keywords
Model explanation; interpretability; information theory; cooperative game theory; psychology; BLACK-BOX; CLASSIFICATIONS; EXPRESSION; REGRESSION; DECISIONS
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Researchers have proposed a wide variety of model explanation approaches, but it remains unclear how most methods are related or when one method is preferable to another. We describe a new unified class of methods, removal-based explanations, that are based on the principle of simulating feature removal to quantify each feature's influence. These methods vary in several respects, so we develop a framework that characterizes each method along three dimensions: 1) how the method removes features, 2) what model behavior the method explains, and 3) how the method summarizes each feature's influence. Our framework unifies 26 existing methods, including several of the most widely used approaches: SHAP, LIME, Meaningful Perturbations, and permutation tests. This newly understood class of explanation methods has rich connections that we examine using tools that have been largely overlooked by the explainability literature. To anchor removal-based explanations in cognitive psychology, we show that feature removal is a simple application of subtractive counterfactual reasoning. Ideas from cooperative game theory shed light on the relationships and trade-offs among different methods, and we derive conditions under which all removal-based explanations have information-theoretic interpretations. Through this analysis, we develop a unified framework that helps practitioners better understand model explanation tools, and that offers a strong theoretical foundation upon which future explainability research can build.
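
The three dimensions named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the baseline-replacement removal strategy, the toy model, and the two summary techniques (leave-one-out differences and exact Shapley values) are assumptions chosen only to make the structure concrete.

```python
import itertools
import math

import numpy as np


def remove_features(x, subset, baseline):
    """Dimension 1 (assumed strategy): keep the features in `subset`,
    replace every other feature with its baseline value."""
    keep = np.array([i in subset for i in range(len(x))])
    return np.where(keep, x, baseline)


def model_behavior(model, x, subset, baseline):
    """Dimension 2 (assumed choice): explain the prediction for one input."""
    return model(remove_features(x, subset, baseline))


def leave_one_out(model, x, baseline):
    """Dimension 3, option A: drop in output when each feature is removed alone."""
    full = set(range(len(x)))
    v_full = model_behavior(model, x, full, baseline)
    return np.array(
        [v_full - model_behavior(model, x, full - {i}, baseline) for i in full]
    )


def shapley_values(model, x, baseline):
    """Dimension 3, option B: exact Shapley values over all feature subsets
    (exponential cost, so only suitable for a handful of features)."""
    d = len(x)
    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for k in range(d):
            weight = math.factorial(k) * math.factorial(d - k - 1) / math.factorial(d)
            for s in itertools.combinations(others, k):
                with_i = model_behavior(model, x, set(s) | {i}, baseline)
                without_i = model_behavior(model, x, set(s), baseline)
                phi[i] += weight * (with_i - without_i)
    return phi


def toy_model(z):
    """Hypothetical model with one additive term and one interaction term."""
    return 2.0 * z[0] + z[1] * z[2]


x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros(3)  # assumed "removed" value for every feature

loo = leave_one_out(toy_model, x, baseline)
phi = shapley_values(toy_model, x, baseline)
print("leave-one-out:", loo)
print("Shapley:", phi)
```

With this toy model the two summaries disagree on the interacting features: leave-one-out credits each of them with the whole interaction term, while the Shapley values split it between them and sum to the difference between the prediction at `x` and at the baseline.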
Pages: 90