Microbiome meta-analysis and cross-disease comparison enabled by the SIAMCAT machine learning toolbox

Cited by: 110
Authors
Wirbel, Jakob [1]
Zych, Konrad [1, 2]
Essex, Morgan [1, 3, 4]
Karcher, Nicolai [1, 5]
Kartal, Ece [1]
Salazar, Guillem [6, 7]
Bork, Peer [1, 8, 9, 10]
Sunagawa, Shinichi [6, 7]
Zeller, Georg [1]
Affiliations
[1] European Mol Biol Lab EMBL, Struct & Computat Biol Unit, D-69117 Heidelberg, Germany
[2] Clin Microbiom AS, Ole Maaloes Vej 3, DK-2200 Copenhagen, Denmark
[3] Max Delbruck Ctr Mol Med, Expt & Clin Res Ctr ECRC, D-13125 Berlin, Germany
[4] Charite, D-13125 Berlin, Germany
[5] Univ Trento, Dept CIBIO, I-38123 Trento, Italy
[6] Swiss Fed Inst Technol, Inst Microbiol, Dept Biol, CH-8093 Zurich, Switzerland
[7] Swiss Fed Inst Technol, Swiss Inst Bioinformat, CH-8093 Zurich, Switzerland
[8] Mol Med Partnership Unit, Heidelberg, Germany
[9] Max Delbruck Ctr Mol Med, D-13125 Berlin, Germany
[10] Univ Wurzburg, Dept Bioinformat, Bioctr, D-97074 Wurzburg, Germany
Keywords
Microbiome data analysis; Machine learning; Statistical modeling; Microbiome-wide association studies (MWAS); Meta-analysis; INFLAMMATORY-BOWEL-DISEASE; HUMAN GUT MICROBIOME; INTESTINAL MICROBIOTA; PARKINSONS-DISEASE; FECAL MICROBIOTA; COMMUNITY; ASSOCIATION; METAGENOME; CLASSIFICATION; PREDICTION;
DOI
10.1186/s13059-021-02306-1
Chinese Library Classification (CLC)
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
The human microbiome is increasingly mined for diagnostic and therapeutic biomarkers using machine learning (ML). However, metagenomics-specific software is scarce, and overoptimistic evaluation and limited cross-study generalization are prevailing issues. To address these, we developed SIAMCAT, a versatile R toolbox for ML-based comparative metagenomics. We demonstrate its capabilities in a meta-analysis of fecal metagenomic studies (10,803 samples). When naively transferred across studies, ML models lost accuracy and disease specificity, which could however be resolved by a novel training set augmentation strategy. This reveals some biomarkers to be disease-specific, with others shared across multiple conditions. SIAMCAT is freely available from siamcat.embl.de.
Pages: 27
Source
Genome Biology, 22