Interpretable machine learning with tree-based shapley additive explanations: Application to metabolomics datasets for binary classification

Cited by: 25
Authors
Bifarin, Olatomiwa O. [1 ,2 ]
Affiliations
[1] Univ Georgia, Dept Biochem & Mol Biol, Athens, GA 30602 USA
[2] Georgia Inst Technol, Sch Chem & Biochem, Atlanta, GA 30602 USA
Source
PLOS ONE | 2023 / Vol. 18 / Issue 05
Keywords
METABOLIGHTS; REPOSITORY;
DOI
10.1371/journal.pone.0284315
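Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code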
07 ; 0710 ; 09 ;
Abstract
Machine learning (ML) models are used in clinical metabolomics studies, most notably for biomarker discovery, to identify metabolites that discriminate between a case and a control group. To improve understanding of the underlying biomedical problem and to bolster confidence in these discoveries, model interpretability is germane. In metabolomics, partial least squares discriminant analysis (PLS-DA) and its variants are widely used, partly because the model can be interpreted with Variable Influence in Projection (VIP) scores, a global interpretability method. Herein, Tree-based Shapley Additive explanations (SHAP), an interpretable ML method grounded in game theory, was used to explain ML models with local explanation properties. In this study, ML experiments (binary classification) were conducted for three published metabolomics datasets using PLS-DA, random forests, gradient boosting, and extreme gradient boosting (XGBoost). Using one of the datasets, the PLS-DA model was explained using VIP scores, while one of the best-performing models, a random forest model, was interpreted using Tree SHAP. The results show that SHAP has more explanatory depth than PLS-DA's VIP, making it a powerful method for rationalizing machine learning predictions from metabolomics studies.
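To make the "local explanation" property mentioned in the abstract concrete, the following is a minimal sketch of exact Shapley values computed by brute-force coalition enumeration. It is not the Tree SHAP algorithm used in the paper, which obtains such attributions efficiently for tree ensembles; the toy model, the three "metabolite" features, and the all-zero reference sample are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley values for one sample x.

    Features outside a coalition are set to the background (reference) value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else background[i] for i in range(n)]
        return predict(z)

    phi = []
    for j in range(n):
        others = [i for i in range(n) if i != j]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature j
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {j}) - value(s))
        phi.append(total)
    return phi


# Hypothetical toy model: a score from three metabolite intensities,
# with an interaction between the first and third feature.
def toy_model(z):
    return 2.0 * z[0] - 1.0 * z[1] + 0.5 * z[0] * z[2]


background = [0.0, 0.0, 0.0]   # assumed reference sample
sample = [1.0, 2.0, 4.0]       # assumed sample to explain

phi = shapley_values(toy_model, sample, background)

# Local accuracy: the attributions sum to f(sample) - f(background).
gap = toy_model(sample) - toy_model(background)
print(phi, sum(phi), gap)
```

Each sample receives its own attribution vector, which is what distinguishes this kind of explanation from a single global ranking such as VIP; averaging the absolute attributions over many samples is one common way to recover a global importance score.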
Pages: 21