Why did AI get this one wrong? - Tree-based explanations of machine learning model predictions

Cited by: 24
Authors
Parimbelli, Enea [1, 3, 5]
Buonocore, Tommaso Mario [1]
Nicora, Giovanna [1, 2]
Michalowski, Wojtek [3]
Wilk, Szymon [4]
Bellazzi, Riccardo [1]
Affiliations
[1] Univ Pavia, Dept Elect Comp & Biomed Engn, Pavia, Italy
[2] enGenome srl, Pavia, Italy
[3] Univ Ottawa, Telfer Sch Management, Ottawa, ON, Canada
[4] Poznan Univ Tech, Inst Comp Sci, Div Intelligent Decis Support Syst, Poznan, Poland
[5] Dept Elect Comp & Biomed Engn, Via Ferrata 5, I-27100 Pavia, Italy
Funding
EU Horizon 2020
Keywords
XAI; Black-box; Explanation; Local explanation; Interpretable; Explainable; Fidelity; Reliability; Post-hoc; Model agnostic; Surrogate model; DATASET SHIFT;
DOI
10.1016/j.artmed.2022.102471
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Increasingly complex learning methods such as boosting, bagging and deep learning have made ML models more accurate, but harder to interpret and explain, culminating in black-box machine learning models. Model developers and users alike are often presented with a trade-off between performance and intelligibility, especially in high-stakes applications like medicine. In the present article we propose a novel methodological approach for generating explanations for the predictions of a generic machine learning model, given a specific instance for which the prediction has been made. The method, named AraucanaXAI, is based on surrogate, locally-fitted classification and regression trees that are used to provide post-hoc explanations of the prediction of a generic machine learning model. Advantages of the proposed XAI approach include superior fidelity to the original model, the ability to deal with non-linear decision boundaries, and native support for both classification and regression problems. We provide a packaged, open-source implementation of the AraucanaXAI method and evaluate its behaviour in a number of different settings that are commonly encountered in medical applications of AI. These include potential disagreement between the model prediction and the physician's expert opinion, and low reliability of the prediction due to data scarcity.
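The general idea of a locally fitted surrogate tree can be illustrated with a minimal, standard-library-only sketch. This is not the AraucanaXAI package or its API, and it does not reproduce the authors' neighbourhood construction. The names `black_box`, `fit_tree`, `predict` and `explain_locally`, the toy circular decision boundary, and the parameters `k` and `max_depth` are all assumptions made for illustration. The sketch selects the nearest neighbours of one instance, labels them with the black-box model, fits a small classification tree to those labels, and reports fidelity as the fraction of neighbours on which the tree agrees with the black box.

```python
import math
import random


def black_box(x):
    """Hypothetical opaque model: class 1 inside the unit circle (non-linear boundary)."""
    return 1 if x[0] ** 2 + x[1] ** 2 < 1.0 else 0


def gini(labels):
    """Gini impurity of a list of binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def fit_tree(X, y, depth=0, max_depth=3):
    """Fit a tiny CART-style binary classification tree by greedy Gini splits."""
    majority = 1 if 2 * sum(y) >= len(y) else 0
    if depth == max_depth or len(set(y)) <= 1:
        return {"leaf": majority}
    best = None
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y[i] for i, row in enumerate(X) if row[f] <= t]
            right = [y[i] for i, row in enumerate(X) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:
        return {"leaf": majority}
    _, f, t = best
    li = [i for i, row in enumerate(X) if row[f] <= t]
    ri = [i for i, row in enumerate(X) if row[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": fit_tree([X[i] for i in li], [y[i] for i in li], depth + 1, max_depth),
        "right": fit_tree([X[i] for i in ri], [y[i] for i in ri], depth + 1, max_depth),
    }


def predict(tree, x):
    """Walk the tree from the root to a leaf and return the leaf's class."""
    while "leaf" not in tree:
        branch = "left" if x[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


def explain_locally(model, data, instance, k=60, max_depth=3):
    """Fit a surrogate tree on the k nearest neighbours of `instance`.

    The tree is trained on the model's predictions (not on ground truth),
    so fidelity measures agreement with the black box in that neighbourhood.
    """
    neighbours = sorted(data, key=lambda p: math.dist(p, instance))[:k]
    labels = [model(p) for p in neighbours]
    tree = fit_tree(neighbours, labels, max_depth=max_depth)
    agree = sum(predict(tree, p) == lab for p, lab in zip(neighbours, labels))
    return tree, agree / len(neighbours)


random.seed(0)
data = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(500)]
instance = (0.9, 0.3)  # a point close to the decision boundary
tree, fidelity = explain_locally(black_box, data, instance)
print("local fidelity:", fidelity)
print("surrogate prediction:", predict(tree, instance), "model:", black_box(instance))
```

Because the tree is fitted only on the neighbourhood, its axis-aligned splits approximate the curved boundary locally rather than globally, which is the motivation for a local surrogate in the first place.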
Pages: 9
Related papers
50 in total
  • [1] Tree-based machine learning approaches for equity market predictions
    Dominik Wolff
    Ulrich Neugebauer
    Journal of Asset Management, 2019, 20 : 273 - 288
  • [2] Tree-based machine learning approaches for equity market predictions
    Wolff, Dominik
    Neugebauer, Ulrich
    JOURNAL OF ASSET MANAGEMENT, 2019, 20 (04) : 273 - 288
  • [3] Discussion on the tree-based machine learning model in the study of landslide susceptibility
    Liu, Qiang
    Tang, Aiping
    Huang, Ziyuan
    Sun, Lixin
    Han, Xiaosheng
    NATURAL HAZARDS, 2022, 113 (02) : 887 - 911
  • [4] Discussion on the tree-based machine learning model in the study of landslide susceptibility
    Qiang Liu
    Aiping Tang
    Ziyuan Huang
    Lixin Sun
    Xiaosheng Han
    Natural Hazards, 2022, 113 : 887 - 911
  • [5] Interpretable machine learning with tree-based shapley additive explanations: Application to metabolomics datasets for binary classification
    Bifarin, Olatomiwa O.
    PLOS ONE, 2023, 18 (05):
  • [6] Fundamental error in tree-based machine learning model selection for reservoir characterisation
    Daniel Asante Otchere
    Energy Geoscience, 2024, 5 (02) : 218 - 228
  • [7] Fundamental error in tree-based machine learning model selection for reservoir characterisation
    Otchere, Daniel Asante
    ENERGY GEOSCIENCE, 2024, 5 (02):
  • [8] Protein pKa Prediction by Tree-Based Machine Learning
    Chen, Ada Y.
    Lee, Juyong
    Damjanovic, Ana
    Brooks, Bernard R.
    JOURNAL OF CHEMICAL THEORY AND COMPUTATION, 2022, 18 (04) : 2673 - 2686
  • [9] Tree-based interpretable machine learning of the thermodynamic phases
    Yang, Jintao
    Cao, Junpeng
    PHYSICS LETTERS A, 2021, 412
  • [10] Runtime Optimizations for Tree-based Machine Learning Models
    Asadi, Nima
    Lin, Jimmy
    de Vries, Arjen P.
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (09) : 2281 - 2292