Statistical inference for time course RNA-Seq data using a negative binomial mixed-effect model

Cited by: 18
Authors
Sun, Xiaoxiao [1]
Dalpiaz, David [2]
Wu, Di [3]
Liu, Jun S. [3]
Zhong, Wenxuan [1]
Ma, Ping [1]
Affiliations
[1] Univ Georgia, Dept Stat, 101 Cedar St, Athens, GA 30602 USA
[2] Univ Illinois, Dept Stat, 725 South Wright St, Champaign, IL 61820 USA
[3] Harvard Univ, Dept Stat, One Oxford St, Cambridge, MA 02138 USA
Source
BMC Bioinformatics | 2016 / Vol. 17
Funding
U.S. National Science Foundation
Keywords
Differentially expressed gene; Gene set enrichment; Analysis of variance; Smoothing spline; Penalized likelihood; Bioconductor package; Expression analysis; Gene
DOI
10.1186/s12859-016-1180-9
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Accurate identification of differentially expressed (DE) genes in time course RNA-Seq data is crucial for understanding the dynamics of transcriptional regulatory networks. However, most of the available methods treat gene expression at different time points as replicates and test the significance of the mean expression difference between treatments or conditions irrespective of time. They thus fail to identify many DE genes with different profiles across time. In this article, we propose a negative binomial mixed-effect model (NBMM) to identify DE genes in time course RNA-Seq data. In the NBMM, mean gene expression is characterized by a fixed effect, and time dependency is described by random effects. The NBMM is very flexible and can be fitted to both unreplicated and replicated time course RNA-Seq data via a penalized likelihood method. By comparing gene expression profiles over time, we further classify the DE genes into two subtypes to enhance the understanding of expression dynamics. A significance test for detecting DE genes is derived using a Kullback-Leibler distance ratio. Additionally, a significance test for gene sets is developed using a gene set score.
Results: Simulation analysis shows that the NBMM outperforms currently available methods for detecting DE genes and gene sets. Moreover, our real data analysis of fruit fly developmental time course RNA-Seq data demonstrates that the NBMM identifies biologically relevant genes, which are well justified by gene ontology analysis.
Conclusions: The proposed method is powerful and efficient for detecting biologically relevant DE genes and gene sets in time course RNA-Seq data.
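The abstract's central point, that pooling time points as replicates can hide genes whose profiles differ over time, can be illustrated with a small Python sketch. It is not the NBMM itself: there are no random effects, no smoothing splines, and no Kullback-Leibler distance ratio. The counts, the dispersion value `phi = 0.1`, and the helper names `nb_logpmf` and `loglik` are all assumptions made up for illustration. The sketch only compares a shared-profile fit against condition-specific fits under a plain negative binomial likelihood.

```python
import math


def nb_logpmf(y, mu, phi):
    """Log-probability of count y under a negative binomial with
    mean mu and dispersion phi (variance = mu + phi * mu**2)."""
    r = 1.0 / phi  # "size" parameter of the negative binomial
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu))
            + y * math.log(mu / (r + mu)))


def loglik(counts, means, phi=0.1):
    """Sum of NB log-probabilities over time points (phi is an assumed value)."""
    return sum(nb_logpmf(y, m, phi) for y, m in zip(counts, means))


# Hypothetical counts for one gene at five time points (not from the paper).
control = [10, 20, 30, 40, 50]   # rising profile
treated = [50, 40, 30, 20, 10]   # falling profile

# Treating time points as replicates: compare the overall means only.
pooled_diff = sum(treated) / len(treated) - sum(control) / len(control)  # 0.0

# Comparing profiles time point by time point exposes the difference.
per_time_diff = [t - c for t, c in zip(treated, control)]  # [40, 20, 0, -20, -40]

# Null model: both conditions share one mean profile over time.
shared = [(c + t) / 2 for c, t in zip(control, treated)]
ll_null = loglik(control, shared) + loglik(treated, shared)

# Alternative model: each condition has its own profile (saturated fit here).
ll_alt = loglik(control, control) + loglik(treated, treated)

print("pooled mean difference:", pooled_diff)
print("per-time differences:", per_time_diff)
print("log-likelihood gain of separate profiles:", ll_alt - ll_null)
```

The pooled comparison reports no difference at all, while the profile-wise likelihood comparison favours separate curves. A saturated fit like the one above would overfit real data, which is one reason a penalized likelihood with smooth time effects, as the abstract describes, is used in practice.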
Pages: 13