MarkerML - Marker Feature Identification in Metagenomic Datasets Using Interpretable Machine Learning

Cited by: 6
Authors
Nagpal, Sunil [1, 2, 3]
Singh, Rohan [1]
Taneja, Bhupesh [2, 3]
Mande, Sharmila S. [1]
Affiliations
[1] Tata Consultancy Serv Ltd, TCS Res, Pune 411013, India
[2] CSIR, Inst Genom & Integrat Biol GIB, New Delhi 110025, India
[3] Acad Sci & Innovat Res AcSIR, Ghaziabad 201002, India
Keywords
metagenomic biomarkers; interpretable machine learning; SHAP; microbiome; marker features; DATABASE;
DOI
10.1016/j.jmb.2022.167589
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
Identifying environment-specific marker features is a key objective of many metagenomic studies: the aim is to find features in microbiome datasets that can serve as markers of contrasting or comparable states. The hypothesis tests and black-box machine-learnt models conventionally used for this purpose are generally not exhaustive, chiefly because they do not provide any quantifiable relevance (context) of, or between, the identified features. We present the MarkerML web server, which leverages the emergence of interpretable machine learning to facilitate the contextual discovery of metagenomic features of interest. It does so through a comprehensive and automated application of Shapley Additive Explanations (SHAP), alongside compositionality-aware hypothesis testing, for multivariate microbiome datasets. MarkerML not only helps identify marker features but also provides insight into the role and interdependence of the identified features in driving the decisions of the supervised machine-learnt model. The platform generates high-quality, intuitive visualizations, including prediction effect plots, model performance reports, feature dependency plots, Shapley- and abundance-informed cladograms (Sungrams), and hypothesis-tested violin plots, along with provisions for excluding participant bias and ensuring reproducibility of results; these are intended to make it a useful asset for scientists in the microbiome field and beyond. The MarkerML web server is freely available to the academic community at https://microbiome.igib.res.in/markerml/. (c) 2022 Elsevier Ltd. All rights reserved.
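The two ideas the abstract pairs, compositionality-aware preprocessing and Shapley-based attribution, can be illustrated with a minimal Python sketch. This is not MarkerML's implementation: the centred log-ratio (CLR) transform is only one common way to account for compositionality, and the counts, pseudocount, zero baseline, and `toy_model` below are all hypothetical values chosen for illustration. Shapley values are computed exactly by enumerating feature orderings, which is feasible only for a handful of features; practical SHAP tooling relies on approximations.

```python
import math
from itertools import permutations


def clr(counts, pseudocount=1.0):
    """Centred log-ratio transform of one sample's feature counts.

    The pseudocount (an assumption here) avoids log(0) for absent features.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def shapley_values(predict, sample, baseline):
    """Exact Shapley values for one sample.

    For every ordering of the features, switch each feature from its
    baseline value to its observed value and record the change in the
    prediction; the Shapley value is the average change per feature.
    """
    n = len(sample)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = sample[i]
            updated = predict(current)
            phi[i] += updated - previous
            previous = updated
    return [p / len(orderings) for p in phi]


def toy_model(x):
    # Hypothetical score: two linear terms plus one interaction term.
    return 2.0 * x[0] - 1.0 * x[1] + 0.5 * x[0] * x[2]


# Hypothetical read counts for three taxa in one sample.
sample = clr([120, 30, 50])
baseline = [0.0, 0.0, 0.0]  # assumed reference point: the CLR centre

phi = shapley_values(toy_model, sample, baseline)
for name, value in zip(["taxon_A", "taxon_B", "taxon_C"], phi):
    print(f"{name}: {value:+.4f}")
```

The attributions sum to the difference between the model's output for the sample and for the baseline, and the interaction term is split equally between the two features involved. This additivity is what lets Shapley-based plots report both the role of individual features and their interdependence.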
Pages: 12