Interpretability in healthcare: A comparative study of local machine learning interpretability techniques

Cited by: 67
Authors
ElShawi, Radwa [1 ]
Sherif, Youssef [1 ]
Al-Mallah, Mouaz [2 ]
Sakr, Sherif [1 ]
Affiliations
[1] Tartu Univ, Tartu, Estonia
[2] Houston Methodist Ctr, Houston, TX USA
Keywords
big data; data science; interpretability; machine learning
DOI
10.1111/coin.12410
CLC (Chinese Library Classification) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Although complex machine learning models (e.g., random forests, neural networks) commonly outperform traditional, simple, interpretable models (e.g., linear regression, decision trees), clinicians in the healthcare domain find it hard to understand and trust these complex models because their predictions lack intuition and explanation. With the General Data Protection Regulation (GDPR), the plausibility and verifiability of predictions made by machine learning models have become essential. Hence, interpretability techniques for machine learning models are an active area of research. In general, the main aim of these techniques is to shed light on the prediction process of machine learning models and to explain how their predictions were generated. A major problem in this context is that both the quality of interpretability techniques and the trust in machine learning model predictions are challenging to measure. In this article, we propose four fundamental quantitative measures for assessing the quality of interpretability techniques: similarity, bias detection, execution time, and trust. We present a comprehensive experimental evaluation of six recent and popular local model-agnostic interpretability techniques, namely, LIME, SHAP, Anchors, LORE, ILIME, and MAPLE, on different types of real-world healthcare data. Building on previous work, our experimental evaluation covers several comparison aspects, including identity, stability, separability, similarity, execution time, bias detection, and trust. The results of our experiments show that MAPLE achieves the highest performance on the identity metric across all data sets included in this study, while LIME achieves the lowest. LIME achieves the highest performance on the separability metric across all data sets. On average, SHAP has the smallest time to output an explanation across all data sets included in this study. For bias detection, SHAP and MAPLE enable participants to better detect the bias. For the trust metric, Anchors achieves the highest performance on all data sets included in this work.
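To make one of the abstract's evaluation measures concrete, the sketch below (a minimal Python illustration, not the authors' implementation; it assumes the scikit-learn, numpy, lime, and shap packages and substitutes a toy scikit-learn data set for the paper's healthcare data) estimates the identity metric: explaining the same instance twice should yield the same explanation.

import numpy as np
import shap
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
X, y = data.data, data.target
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# LIME fits a local surrogate on random perturbations around the instance,
# so repeated explanations of the same instance may differ.
lime_explainer = LimeTabularExplainer(
    X, feature_names=list(data.feature_names), mode="classification")

def lime_vector(x):
    # LIME feature weights as a dense vector indexed by feature id.
    exp = lime_explainer.explain_instance(
        x, model.predict_proba, num_features=X.shape[1])
    w = np.zeros(X.shape[1])
    for idx, weight in exp.as_map()[1]:
        w[idx] = weight
    return w

# Kernel SHAP on a scalar output (probability of class 1); it is also
# sampling-based, so determinism is not guaranteed either.
predict_pos = lambda z: model.predict_proba(z)[:, 1]
shap_explainer = shap.KernelExplainer(predict_pos, shap.sample(X, 50))

def shap_vector(x):
    return np.ravel(shap_explainer.shap_values(x.reshape(1, -1), nsamples=100))

def identity(explain, instances):
    # Fraction of instances whose two independent explanations coincide.
    return float(np.mean([np.allclose(explain(x), explain(x)) for x in instances]))

rng = np.random.RandomState(0)
sample = X[rng.choice(len(X), 10, replace=False)]
print("LIME identity:", identity(lime_vector, sample))
print("SHAP identity:", identity(shap_vector, sample))

Because both explainers here rely on random sampling, identity scores below 1.0 are to be expected, which is consistent with the abstract's finding that LIME scores lowest on this metric; fixing random seeds per explanation call is the usual practical workaround.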
Pages: 1633-1650
Page count: 18
Related papers
50 items in total
  • [1] Interpretability in HealthCare: A Comparative Study of Local Machine Learning Interpretability Techniques
    El Shawi, Radwa
    Sherif, Youssef
    Al-Mallah, Mouaz
    Sakr, Sherif
    [J]. 2019 IEEE 32ND INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS (CBMS), 2019, : 275 - 280
  • [2] A Study on Interpretability of Decision of Machine Learning
    Shirataki, Shohei
    Yamaguchi, Saneyasu
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017, : 4830 - 4831
  • [3] Survey on Techniques, Applications and Security of Machine Learning Interpretability
    Ji, Shouling
    Li, Jinfeng
    Du, Tianyu
    Li, Bo
    [J]. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2019, 56 (10): : 2071 - 2096
  • [4] Against Interpretability: a Critical Examination of the Interpretability Problem in Machine Learning
Krishnan, M.
    [J]. Philosophy & Technology, 2020, 33 (3) : 487 - 502
  • [5] Interpretability of machine learning-based prediction models in healthcare
    Stiglic, Gregor
    Kocbek, Primoz
    Fijacko, Nino
    Zitnik, Marinka
    Verbert, Katrien
    Cilar, Leona
    [J]. WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2020, 10 (05)
  • [6] Determinants of the price of bitcoin: An analysis with machine learning and interpretability techniques
    Carbo, Jose Manuel
    Gorjon, Sergio
    [J]. INTERNATIONAL REVIEW OF ECONOMICS & FINANCE, 2024, 92 : 123 - 140
  • [7] Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare
    Imrie, Fergus
    Davis, Robert
    van der Schaar, Mihaela
    [J]. NATURE MACHINE INTELLIGENCE, 2023, 5 (08) : 824 - 829
  • [8] A Review of Framework for Machine Learning Interpretability
    Araujo, Ivo de Abreu
    Torres, Renato Hidaka
    Sampaio Neto, Nelson Cruz
    [J]. AUGMENTED COGNITION, AC 2022, 2022, 13310 : 261 - 272
  • [9] Interpreting Interpretability: Understanding Data Scientists' Use of Interpretability Tools for Machine Learning
    Kaur, Harmanpreet
    Nori, Harsha
    Jenkins, Samuel
    Caruana, Rich
    Wallach, Hanna
    Vaughan, Jennifer Wortman
    [J]. PROCEEDINGS OF THE 2020 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI'20), 2020