Benchmark of Data Processing Methods and Machine Learning Models for Gut Microbiome-Based Diagnosis of Inflammatory Bowel Disease

Cited by: 18
Authors
Kubinski, Ryszard [1]
Djamen-Kepaou, Jean-Yves [1]
Zhanabaev, Timur [1]
Hernandez-Garcia, Alex [2]
Bauer, Stefan [3]
Hildebrand, Falk [4, 5]
Korcsmaros, Tamas [4, 5]
Karam, Sani [1]
Jantchou, Prevost [6]
Kafi, Kamran [1]
Martin, Ryan D. [1]
Affiliations
[1] Phyla Technol Inc, Montreal, PQ, Canada
[2] Univ Montreal, Quebec Artificial Intelligence Inst, Montreal, PQ, Canada
[3] Max Planck Inst Intelligent Syst, Tubingen, Germany
[4] Quadram Inst Biosci, Gut Microbes & Hlth, Norwich, Norfolk, England
[5] Earlham Inst, Norwich, Norfolk, England
[6] Ctr Hosp Univ St Justine, Montreal, PQ, Canada
Funding
European Research Council; EU Horizon 2020; UK Biotechnology and Biological Sciences Research Council
Keywords
inflammatory bowel disease; machine learning; gut microbiome; batch effect reduction; data normalization; QIIME2; PICRUSt2; COMPOSITIONAL DATA; ULCERATIVE-COLITIS; RISK-FACTORS; DIVERSITY; DELAY; EXPRESSION; PREDICTION; THERAPY; IMPACT; SILVA;
DOI
10.3389/fgene.2022.784397
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Patients with inflammatory bowel disease (IBD) wait months and undergo numerous invasive procedures between the initial appearance of symptoms and receiving a diagnosis. To reduce the time to diagnosis and improve patient wellbeing, machine learning algorithms capable of diagnosing IBD from the composition of the gut microbiome are currently being explored. To date, these models have had limited clinical application because their performance decreases when they are applied to a new cohort of patient samples. Various methods have been developed to analyze microbiome data, which may improve the generalizability of machine learning IBD diagnostic tests. With an abundance of methods, there is a need to benchmark the performance and generalizability of various machine learning pipelines (from data processing to training a machine learning model) for microbiome-based IBD diagnostic tools. We collected fifteen 16S rRNA microbiome datasets (7,707 samples) from North America to benchmark combinations of gut microbiome features, data normalization and transformation methods, batch effect correction methods, and machine learning models. Pipeline generalizability to new cohorts of patients was evaluated with two binary classification metrics following leave-one-dataset-out (LODO) cross-validation, where all samples from one study were left out of the training set and used for testing. We demonstrate that taxonomic features processed with a compositional transformation method and batch effect correction with the naive zero-centering method attain the best classification performance. In addition, machine learning models that identify non-linear decision boundaries between labels are more generalizable than those that are linearly constrained. Lastly, we illustrate the importance of generating a curated training dataset to ensure similar performance across patient demographics.
These findings will help improve the generalizability of machine learning models as we move towards non-invasive diagnostic and disease management tools for patients with IBD.
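The three processing ideas named in the abstract (a compositional transformation, zero-centering per batch, and leave-one-dataset-out splitting) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the centered log-ratio (CLR) transform is assumed here as one example of a compositional transformation, zero-centering is assumed to mean subtracting each dataset's per-feature mean, and the function names, pseudocount, and toy counts are all hypothetical.

```python
import math
from collections import defaultdict


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of one sample's taxon counts.

    The pseudocount (an assumption of this sketch) avoids log(0).
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def zero_center_by_batch(rows, batches):
    """Subtract each batch's per-feature mean so every batch is centered at zero."""
    by_batch = defaultdict(list)
    for row, batch in zip(rows, batches):
        by_batch[batch].append(row)
    # Per-batch mean of every feature column.
    means = {
        batch: [sum(col) / len(col) for col in zip(*batch_rows)]
        for batch, batch_rows in by_batch.items()
    }
    return [
        [x - m for x, m in zip(row, means[batch])]
        for row, batch in zip(rows, batches)
    ]


def lodo_splits(batches):
    """Yield (held_out, train_idx, test_idx), holding out one dataset at a time."""
    for held_out in sorted(set(batches)):
        train = [i for i, b in enumerate(batches) if b != held_out]
        test = [i for i, b in enumerate(batches) if b == held_out]
        yield held_out, train, test


# Toy data (hypothetical): four samples, three taxa, two source studies.
counts = [[10, 0, 5], [3, 7, 1], [8, 2, 2], [0, 9, 4]]
studies = ["A", "A", "B", "B"]

features = zero_center_by_batch([clr(row) for row in counts], studies)

for held_out, train_idx, test_idx in lodo_splits(studies):
    # A classifier would be fit on features[train_idx] and scored on features[test_idx].
    print(held_out, train_idx, test_idx)
```

In this sketch the held-out dataset is centered with its own feature means, which requires no labels from that dataset; whether the benchmarked pipelines handle the test batch the same way is not stated in the abstract.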
Pages: 22