Crohn's Disease Prediction Using Sequence Based Machine Learning Analysis of Human Microbiome

Cited by: 2
Authors
Unal, Metehan [1]
Bostanci, Erkan [1]
Ozkul, Ceren [2]
Acici, Koray [3]
Asuroglu, Tunc [4]
Guzel, Mehmet Serdar [1]
Affiliations
[1] Ankara Univ, Dept Comp Engn, TR-06830 Ankara, Turkiye
[2] Hacettepe Univ, Fac Pharm, Dept Pharmaceut Microbiol, TR-06110 Ankara, Turkiye
[3] Ankara Univ, Dept Artificial Intelligence & Data Engn, TR-06830 Ankara, Turkiye
[4] Tampere Univ, Fac Med & Hlth Technol, FI-33720 Tampere, Finland
Keywords
microbiota; Machine Learning; bowel disease; bioinformatics; algorithms
DOI
10.3390/diagnostics13172835
Chinese Library Classification number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Human microbiota refers to the trillions of microorganisms that inhabit our bodies and that have a substantial impact on human health and disease. Sampling the microbiota generates massive quantities of data that can be analyzed with Machine Learning algorithms. In this study, we employed several modern Machine Learning techniques to predict Inflammatory Bowel Disease from raw sequence data. The dataset was obtained from NCBI preprocessed graph representations and converted into a structured form. Seven well-known Machine Learning methods were used: Random Forest, Support Vector Machines, Extreme Gradient Boosting, Light Gradient Boosting Machine (LightGBM), Gaussian Naive Bayes, Logistic Regression, and k-Nearest Neighbor. Grid Search was employed for hyperparameter optimization. The performance of the models was evaluated using accuracy, precision, F-score, kappa, and area under the receiver operating characteristic curve. Additionally, McNemar's test was conducted to assess the statistical significance of the differences between models. The data were constructed using k-mer lengths of 3, 4 and 5. The LightGBM model outperformed the other models, with 67.24%, 74.63% and 76.47% accuracy for k-mer lengths of 3, 4 and 5, respectively, and it also achieved the best score on every metric. The study showed promising results for predicting disease from raw sequence data. Finally, McNemar's test indicated statistically significant differences between the Machine Learning approaches.
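The two sequence-specific steps named in the abstract, k-mer feature construction and McNemar's test, can be illustrated with a minimal sketch. This is not the authors' pipeline. The function names `kmer_counts` and `mcnemar_exact` are hypothetical, and the sketch assumes a plain ACGT alphabet, raw counts rather than normalized frequencies, and the exact binomial form of McNemar's test. The abstract does not state which of these choices the study made.

```python
from itertools import product
from math import comb


def kmer_counts(seq, k):
    """Count every DNA k-mer in seq; the result has 4**k entries in lexicographic order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # windows with N or other symbols are skipped
        if j is not None:
            counts[j] += 1
    return counts


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar p-value comparing two classifiers on the same samples."""
    # b: model A correct and model B wrong; c: model A wrong and model B correct
    b = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a != t and m == t)
    n = b + c
    if n == 0:
        return 1.0  # the models never disagree in correctness
    # Under the null hypothesis the discordant pairs follow Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    features = kmer_counts("ACGTACGTNACGT", 3)
    print(len(features), sum(features))
    print(mcnemar_exact([1, 1, 0, 0], [1, 1, 0, 1], [0, 1, 1, 1]))
```

For k = 3, 4 and 5 the feature vector has 64, 256 and 1024 entries respectively, which is the kind of structured table a classifier such as LightGBM can consume.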
Pages: 18