Optimizing for Interpretability in Deep Neural Networks with Tree Regularization

Cited by: 0
Authors
Wu, Mike [1]
Parbhoo, Sonali [2 ]
Hughes, Michael C. [3 ]
Roth, Volker [4 ]
Doshi-Velez, Finale [2 ]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] Harvard Univ, SEAS, Cambridge, MA 02138 USA
[3] Tufts Univ, Medford, MA 02153 USA
[4] Univ Basel, Basel, Switzerland
Keywords: (none listed)
DOI: not available
CLC Classification: TP18 [Artificial Intelligence Theory]
Discipline Codes: 081104; 0812; 0835; 1405
Abstract
Deep models have advanced prediction in many domains, but their lack of interpretability remains a key barrier to their adoption in many real-world applications. There exists a large body of work aiming to help humans understand these black-box functions at varying levels of granularity, for example through distillation, gradients, or adversarial examples. These methods, however, all treat interpretability as a separate process after training. In this work, we take a different approach and explicitly regularize deep models so that they are well approximated by processes that humans can step through in little time. Specifically, we train several families of deep neural networks to resemble compact, axis-aligned decision trees without significant compromises in accuracy. The resulting axis-aligned decision functions make tree-regularized models uniquely easy for humans to interpret. Moreover, for situations in which a single, global tree is a poor estimator, we introduce a regional tree regularizer that encourages the deep model to resemble a compact, axis-aligned decision tree in predefined, human-interpretable contexts. Using intuitive toy examples, benchmark image datasets, and medical tasks for patients in critical care and with HIV, we demonstrate that this new family of tree regularizers yields models that are easier for humans to simulate than L1- or L2-regularized models, without sacrificing predictive power.
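The penalty sketched in the abstract can be illustrated with a minimal, hypothetical example: distill a model's predictions into a shallow, axis-aligned decision tree and measure the tree's average decision-path length (APL), the quantity the regularizer drives down. In the actual method, a small surrogate network makes this penalty differentiable so it can be used during gradient training; that step is omitted here. The function name and the use of scikit-learn are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def average_path_length(X, y_soft, max_depth=5):
    """Fit a shallow tree to hard labels distilled from a model's soft
    predictions, then return the mean number of tree nodes each sample
    visits (the average decision-path length, APL)."""
    y_hard = (np.asarray(y_soft) > 0.5).astype(int)
    if len(np.unique(y_hard)) < 2:       # degenerate case: a single leaf
        return 1.0
    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
    tree.fit(X, y_hard)
    # decision_path returns a sparse (n_samples, n_nodes) indicator matrix;
    # summing it counts every node visit across all samples.
    return tree.decision_path(X).sum() / X.shape[0]

# Toy "black-box" whose predictions depend on a single feature threshold:
rng = np.random.RandomState(0)
X = rng.randn(200, 3)
y_soft = 1.0 / (1.0 + np.exp(-5.0 * X[:, 0]))   # sigmoid of feature 0
apl = average_path_length(X, y_soft)
print(apl)   # one perfect split suffices: every sample visits root + leaf
```

A simple decision boundary yields a one-split tree (APL of 2.0, root plus leaf), while a model with a convoluted boundary forces a deeper tree and a larger APL; penalizing APL therefore pushes the network toward functions a human can step through quickly.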
Pages: 1-37 (37 pages)
Related Papers (50 items)
  • [1] Wu, Mike; Parbhoo, Sonali; Hughes, Michael C.; Roth, Volker; Doshi-Velez, Finale. Optimizing for interpretability in deep neural networks with tree regularization. Journal of Artificial Intelligence Research, 2021, 72.
  • [2] Wu, Mike; Parbhoo, Sonali; Hughes, Michael C.; Kindle, Ryan; Celi, Leo; Zazzi, Maurizio; Roth, Volker; Doshi-Velez, Finale. Regional Tree Regularization for Interpretability in Deep Neural Networks. Thirty-Fourth AAAI Conference on Artificial Intelligence, the Thirty-Second Innovative Applications of Artificial Intelligence Conference and the Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, 2020, 34: 6413-6421.
  • [3] Burkart, Nadia; Faller, Philipp M.; Peinsipp, Elisabeth; Huber, Marco F. Batch-wise Regularization of Deep Neural Networks for Interpretability. 2020 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), 2020: 216-222.
  • [4] Wu, Mike; Hughes, Michael C.; Parbhoo, Sonali; Zazzi, Maurizio; Roth, Volker; Doshi-Velez, Finale. Beyond Sparsity: Tree Regularization of Deep Models for Interpretability. Thirty-Second AAAI Conference on Artificial Intelligence / Thirtieth Innovative Applications of Artificial Intelligence Conference / Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, 2018: 1670-1678.
  • [5] Zhang, Qinglong; Rao, Lu; Yang, Yubin. A Novel Visual Interpretability for Deep Neural Networks by Optimizing Activation Maps with Perturbation. Thirty-Fifth AAAI Conference on Artificial Intelligence, Thirty-Third Conference on Innovative Applications of Artificial Intelligence and the Eleventh Symposium on Educational Advances in Artificial Intelligence, 2021, 35: 3377-3384.
  • [6] Kimura, Masanari; Tanaka, Masayuki. New Perspective of Interpretability of Deep Neural Networks. 2020 3rd International Conference on Information and Computer Technologies (ICICT 2020), 2020: 78-85.
  • [7] Hooker, Sara; Erhan, Dumitru; Kindermans, Pieter-Jan; Kim, Been. A Benchmark for Interpretability Methods in Deep Neural Networks. Advances in Neural Information Processing Systems 32 (NIPS 2019), 2019, 32.
  • [8] Williams, Travis; Li, Robert. Threshout Regularization for Deep Neural Networks. SoutheastCon 2021, 2021: 728-735.
  • [9] Tan, Shawn; Sim, Khe Chai; Gales, Mark. Improving the Interpretability of Deep Neural Networks with Stimulated Learning. 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2015: 617-623.
  • [10] Wu, Chunyang; Gales, Mark J. F.; Ragni, Anton; Karanasou, Penny; Sim, Khe Chai. Improving Interpretability and Regularization in Deep Learning. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018, 26(2): 256-265.