A Comprehensive Study on Predicting Functional Role of Metagenomes Using Machine Learning Methods

Cited by: 8
Authors
Wassan, Jyotsna Talreja [1]
Wang, Haiying [1]
Browne, Fiona [1]
Zheng, Huiru [1]
Affiliation
[1] Ulster Univ, Sch Comp, Belfast BT37 0QB, Antrim, Northern Ireland
Funding
EU Horizon 2020
Keywords
Metagenomics; microbiota; embedded feature selection; operational taxonomic units (OTUs); classification; supervised classification; human microbiome
DOI
10.1109/TCBB.2018.2858808
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
"Metagenomics" is the study of genomic sequences obtained directly from environmental microbial communities, with the aim of linking their structures to their functional roles. The field has advanced at an unprecedented pace through high-throughput omics sequencing, which produces biologically rich data sets. Because microbial species typically outnumber samples in metagenomic data, these datasets suffer from the "curse of dimensionality". Hence, the focus in metagenomics studies has moved towards developing efficient computational models using Machine Learning (ML) that reduce the computational cost. In this paper, we comprehensively assessed various ML approaches for classifying high-dimensional human microbiota into their functional phenotypes. We propose applying embedded feature selection methods, namely Extreme Gradient Boosting and Penalized Logistic Regression, to determine important microbial species. The resulting feature set enhanced the performance of Random Forest (RF), one of the most popular state-of-the-art methods in metagenomic studies. Experimental results indicate that the proposed method achieved the best results in terms of accuracy and area under the Receiver Operating Characteristic curve (ROC-AUC), along with a major improvement in processing time. It outperformed filter and wrapper feature selection methods applied with RF, as well as classifiers such as Support Vector Machine (SVM), Extreme Learning Machine (ELM), and k-Nearest Neighbors (k-NN).
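The idea of embedded feature selection mentioned in the abstract can be illustrated with a minimal sketch. It assumes an L1 (lasso) penalty for the logistic regression, which the abstract does not specify, and fits it with plain proximal gradient descent on synthetic data in which features outnumber samples. The function names, the toy data generator, and all hyperparameter values are hypothetical and are not taken from the paper; the downstream Random Forest step is omitted.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink a value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_toy_data(n_samples=20, n_features=40, seed=0):
    """Synthetic stand-in for an OTU table (assumption): only feature 0 carries signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        # Feature 0 is shifted up for class 1 and down for class 0.
        row[0] = (2.0 if label else -2.0) + rng.gauss(0.0, 0.5)
        X.append(row)
        y.append(label)
    return X, y


def fit_l1_logistic(X, y, penalty=0.3, step=0.1, n_iter=300):
    """L1-penalized logistic regression via proximal gradient descent (ISTA)."""
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(n_iter):
        # Residuals: predicted probability minus true label.
        errors = [
            sigmoid(bias + sum(w * x for w, x in zip(weights, row))) - label
            for row, label in zip(X, y)
        ]
        grad_bias = sum(errors) / n_samples
        grads = [
            sum(e * row[j] for e, row in zip(errors, X)) / n_samples
            for j in range(n_features)
        ]
        bias -= step * grad_bias  # the intercept is not penalized
        # Gradient step followed by soft-thresholding, which produces exact zeros.
        weights = [
            soft_threshold(w - step * g, step * penalty)
            for w, g in zip(weights, grads)
        ]
    return weights, bias


def select_features(weights):
    """Embedded selection: keep the features whose coefficient survived the penalty."""
    return [j for j, w in enumerate(weights) if w != 0.0]


X, y = make_toy_data()
weights, bias = fit_l1_logistic(X, y)
selected = select_features(weights)
print("kept", len(selected), "of", len(X[0]), "features:", selected)
```

The selection happens as a by-product of fitting the model, which is what distinguishes embedded methods from filters (scored before training) and wrappers (scored by repeated retraining). In a pipeline like the one the abstract describes, the columns listed in `selected` would then be passed to the downstream classifier.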
Pages: 751-763
Page count: 13