Explaining by Removing: A Unified Framework for Model Explanation

Cited by: 0
Authors
Covert, Ian C. [1 ]
Lundberg, Scott [2 ]
Lee, Su-In [1 ]
Affiliations
[1] Univ Washington, Paul G Allen Sch Comp Sci Engn, Seattle, WA 98195 USA
[2] Microsoft Corp, Microsoft Res, Redmond, WA 98052 USA
Keywords
Model explanation; interpretability; information theory; cooperative game theory; psychology; BLACK-BOX; CLASSIFICATIONS; EXPRESSION; REGRESSION; DECISIONS;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Researchers have proposed a wide variety of model explanation approaches, but it remains unclear how most methods are related or when one method is preferable to another. We describe a new unified class of methods, removal-based explanations, that are based on the principle of simulating feature removal to quantify each feature's influence. These methods vary in several respects, so we develop a framework that characterizes each method along three dimensions: 1) how the method removes features, 2) what model behavior the method explains, and 3) how the method summarizes each feature's influence. Our framework unifies 26 existing methods, including several of the most widely used approaches: SHAP, LIME, Meaningful Perturbations, and permutation tests. This newly understood class of explanation methods has rich connections that we examine using tools that have been largely overlooked by the explainability literature. To anchor removal-based explanations in cognitive psychology, we show that feature removal is a simple application of subtractive counterfactual reasoning. Ideas from cooperative game theory shed light on the relationships and trade-offs among different methods, and we derive conditions under which all removal-based explanations have information-theoretic interpretations. Through this analysis, we develop a unified framework that helps practitioners better understand model explanation tools, and that offers a strong theoretical foundation upon which future explainability research can build.
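The three dimensions named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the baseline-replacement removal strategy, the leave-one-out summary, and the toy linear model are all assumptions chosen for illustration, and each is only one of many possible choices along its dimension.

```python
import numpy as np

def remove_features(x, keep_mask, baseline):
    # Dimension 1 (assumed choice): simulate removal by replacing
    # the dropped features with fixed baseline values.
    return np.where(keep_mask, x, baseline)

def model_behavior(model, x):
    # Dimension 2 (assumed choice): the behavior being explained is
    # the model's prediction for a single input.
    return model(x)

def leave_one_out(model, x, baseline):
    # Dimension 3 (assumed choice): summarize each feature's influence
    # as the change in behavior when only that feature is removed.
    d = x.shape[0]
    full = model_behavior(model, remove_features(x, np.ones(d, dtype=bool), baseline))
    scores = np.zeros(d)
    for i in range(d):
        mask = np.ones(d, dtype=bool)
        mask[i] = False  # remove feature i only
        scores[i] = full - model_behavior(model, remove_features(x, mask, baseline))
    return scores

def toy_model(x):
    # Hypothetical linear model: the third feature has no effect.
    return 2.0 * x[0] - 1.0 * x[1] + 0.0 * x[2]

x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros(3)
scores = leave_one_out(toy_model, x, baseline)
print(scores)
```

Swapping any one of the three functions (for example, averaging over sampled replacement values instead of using a fixed baseline, or summarizing with Shapley values instead of leave-one-out differences) would yield a different member of the same family of methods.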
Pages: 90