Statistical methods for metagenomics data analysis

Cited by: 5
Authors
Lee, Chanyoung [1]
Lee, Seungyeoun [2]
Park, Taesung [3]
Affiliations
[1] Seoul Natl Univ, Interdisciplinary Program Bioinformat, Seoul, South Korea
[2] Sejong Univ, Dept Math & Stat, Seoul, South Korea
[3] Seoul Natl Univ, Interdisciplinary Program Bioinformat, Dept Stat, Seoul, South Korea
Funding
National Research Foundation, Singapore
Keywords
DAFs; differentially abundant features; metagenome; microbiome; association test; statistical methods; 16S rRNA; OTU; operational taxonomic unit; taxa; logistic regression; DIFFERENTIAL EXPRESSION ANALYSIS; HELICOBACTER-PYLORI; ABUNDANCE ANALYSIS; HUMAN MICROBIOME; BETA REGRESSION; GUT MICROBIOTA; ASSOCIATION; RNA;
DOI
10.1504/IJDMB.2017.10012554
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
With the advent of next-generation sequencing (NGS) technology, sequencing of microbes now allows association analyses between genomic features and the environment. Several statistical methods have been proposed for analysing metagenome data. In this study, we propose a novel method, Centred log ratio-transformed, Permutation-based Logistic regression (CPL), which is built on a logistic regression model and uses the centred log-ratio transformation and permutation. We then systematically compared the performance of CPL and various other statistical methods in finding differentially abundant features (DAFs). We first assessed the type I error rate of each method and then compared the power of each method over different levels of sparsity. Furthermore, we applied the methods to real colorectal cancer (CRC) data and compared the resulting lists of taxonomic markers with the results of a previous CRC study. Based on these results, we recommend using CPL, metagenomeSeq and/or ANCOM, because they controlled the type I error rate well while achieving comparable power.
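The abstract names the two ingredients of CPL (a centred log-ratio transformation and a permutation test built on logistic regression) but gives no implementation details. The Python sketch below only illustrates those two ingredients and should not be read as the authors' procedure. Several things are assumptions of the sketch: the pseudocount of 0.5, the helper names `clr`, `score_stat` and `permutation_pvalue`, the invented toy count table, and the use of the logistic-regression score statistic for the slope (evaluated under the null hypothesis) in place of a fitted coefficient, which keeps the example free of any model-fitting code.

```python
import math
import random


def clr(counts, pseudocount=0.5):
    """Centred log-ratio transform of one sample's taxon counts.

    The pseudocount is an assumption of this sketch; it avoids log(0)
    for taxa that were not observed in the sample.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def score_stat(x, y):
    """Absolute score statistic for the slope b1 in
    logit P(y = 1) = b0 + b1 * x, evaluated at b1 = 0.

    Its null variance does not change when y is permuted, so the
    unstandardised value is enough for a permutation test.
    """
    y_bar = sum(y) / len(y)
    return abs(sum(xi * (yi - y_bar) for xi, yi in zip(x, y)))


def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Permutation p-value: shuffle the labels and count how often the
    permuted statistic is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = score_stat(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point rounding in ties.
        if score_stat(x, shuffled) >= observed - 1e-12:
            hits += 1
    # Adding 1 to both counts keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy table invented for illustration: 12 samples x 4 taxa.
# The first six samples are "cases" with taxon 0 enriched.
counts = [
    [90, 12, 15, 0], [110, 9, 20, 3], [85, 14, 11, 0],
    [120, 10, 18, 2], [95, 16, 13, 1], [100, 11, 17, 0],
    [10, 13, 16, 2], [8, 15, 12, 0], [12, 10, 19, 1],
    [6, 17, 14, 0], [14, 12, 15, 3], [9, 11, 18, 0],
]
labels = [1] * 6 + [0] * 6

clr_rows = [clr(row) for row in counts]
taxon0 = [row[0] for row in clr_rows]  # CLR abundance of taxon 0 per sample
p_value = permutation_pvalue(taxon0, labels)
print(p_value)
```

In a real analysis the same test would be repeated for every taxon and followed by a multiple-testing correction; the abstract does not say which correction, if any, the authors used.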
Pages: 366-385
Page count: 20