Large-scale microbiome data integration enables robust biomarker identification

Cited by: 48
Authors
Xiao, Liwen [1 ,2 ]
Zhang, Fengyi [1 ]
Zhao, Fangqing [1 ,2 ,3 ,4 ]
Affiliations
[1] Chinese Acad Sci, Beijing Inst Life Sci, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] Univ Chinese Acad Sci, Hangzhou Inst Adv Study, Key Lab Syst Biol, Hangzhou, Peoples R China
[4] Chinese Acad Sci, Inst Zool, State Key Lab Integrated Management Pest Insects, Beijing, Peoples R China
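Source
NATURE COMPUTATIONAL SCIENCE | 2022 / Vol. 2 / Issue 05
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords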
POPULATIONS; BACTERIA; NETWORK; MODELS;
DOI
10.1038/s43588-022-00247-8
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The close association between gut microbiota dysbiosis and human diseases is being increasingly recognized. However, contradictory results are frequently reported, as confounding effects exist. The lack of unbiased data integration methods is also impeding the discovery of disease-associated microbial biomarkers from different cohorts. Here we propose an algorithm, NetMoss, for assessing shifts of microbial network modules to identify robust biomarkers associated with various diseases. Compared to previous approaches, the NetMoss method shows better performance in removing batch effects. Through comprehensive evaluations on both simulated and real datasets, we demonstrate that NetMoss has great advantages in the identification of disease-related biomarkers. Based on analysis of pandisease microbiota studies, there is a high prevalence of multidisease-related bacteria in global populations. We believe that large-scale data integration will help in understanding the role of the microbiome from a more comprehensive perspective and that accurate biomarker identification will greatly promote microbiome-based medical diagnosis.
Pages: 307 - 316
Number of pages: 10