A Unified Model for Robust Differential Expression Analysis of RNA-Seq Data

Cited by: 0
Authors
Liu, Kefei [1]
Shen, Li [1]
Jiang, Hui [2]
Affiliations
[1] Univ Penn, Dept Biostat Epidemiol & Informat, Philadelphia, PA 19104 USA
[2] Univ Michigan, Dept Biostat, Ann Arbor, MI 48109 USA
Keywords
Differential expression analysis; Inter-sample normalization; Linear regression; L0 sparsity regularization; RNA-seq; PROSTATE-CANCER; DOWN-REGULATION; NORMALIZATION
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
A fundamental task in RNA-seq data analysis is to determine whether the RNA-seq read counts for a gene or exon differ significantly across experimental conditions. Since RNA-seq measurements are relative in nature, between-sample normalization of counts is an essential step in differential expression (DE) analysis. In most existing methods the normalization step is independent of the DE analysis, which is not well justified, since ideally normalization should be based on non-DE genes only. Recently, Jiang and Zhan proposed a robust statistical model for joint between-sample normalization and DE analysis from log-transformed RNA-seq data. Sample-specific normalization factors are modeled as unknown parameters in the gene-wise linear models, and the L0 penalty is introduced to induce sparsity in the regression coefficients. In their model, the experimental conditions are assumed to be categorical (e.g., 0 for control and 1 for case), and one-way analysis of variance (ANOVA) is used to identify genes that are differentially expressed between two or more conditions. In this work, Jiang and Zhan's model is generalized to accommodate continuous/numerical experimental conditions, and a linear regression model is used to detect genes whose expression level is significantly affected by the experimental conditions. Furthermore, an efficient algorithm is developed to find the global solution of the resulting high-dimensional, non-convex and non-differentiable penalized least squares regression problem. Extensive simulation studies and a real RNA-seq data example show that when the proportion of DE genes is small, or the numbers of up- and down-regulated genes are approximately equal, the proposed method performs similarly to existing methods in terms of detection power and false positive rate. When a large proportion (e.g., > 30%) of genes are differentially expressed in an asymmetric manner, it outperforms existing methods, and the performance gain becomes even more substantial as the sample size increases.
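The joint normalization and DE idea described in the abstract can be illustrated with a small sketch. It assumes a gene-wise model of the form y_ij = alpha_i + beta_i * x_j + d_j + noise, where y_ij is the log-transformed expression of gene i in sample j, x_j is a continuous condition, d_j is a sample-specific normalization factor, and an L0 penalty `lam` is paid for every gene with a non-zero slope. This exact parameterization, the function name `fit_joint_de`, and the parameters `lam`, `n_iter` and `tol` are assumptions made for illustration; they are not taken from the paper. The solver below is a simple block coordinate descent with hard thresholding, which may stop at a local minimum; it is not the global algorithm the abstract refers to.

```python
import numpy as np


def fit_joint_de(Y, x, lam, n_iter=100, tol=1e-8):
    """Heuristic fit of y_ij = alpha_i + beta_i * x_j + d_j + noise (illustrative).

    Y   : (genes, samples) array of log-transformed expression values
    x   : (samples,) continuous experimental condition (must not be constant)
    lam : assumed L0 penalty paid for each gene with a non-zero slope beta_i
    Returns (alpha, beta, d), with d centered so that it sums to zero.
    """
    Y = np.asarray(Y, dtype=float)
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()          # centered condition values
    sxx = xc @ xc              # sum of squares of the centered condition

    # Robust start: per-sample median of gene-centered values, so that a
    # large block of DE genes distorts the initial normalization less than
    # a per-sample mean would.
    d = np.median(Y - Y.mean(axis=1, keepdims=True), axis=0)
    d -= d.mean()

    for _ in range(n_iter):
        R = Y - d                              # normalized expression
        b = (R @ xc) / sxx                     # per-gene OLS slope
        # Keeping a slope lowers the residual sum of squares by b^2 * sxx;
        # keep it only when that gain exceeds the L0 penalty.
        beta = np.where(b ** 2 * sxx > lam, b, 0.0)
        alpha = R.mean(axis=1) - beta * x.mean()
        # Given alpha and beta, the best d_j is the mean residual of sample j.
        d_new = (Y - alpha[:, None] - np.outer(beta, x)).mean(axis=0)
        d_new -= d_new.mean()                  # fix the additive ambiguity
        converged = np.max(np.abs(d_new - d)) < tol
        d = d_new
        if converged:
            break
    return alpha, beta, d
```

Genes with a non-zero entry in the returned `beta` are the ones this sketch flags as DE. The value of `lam` has to be chosen by the user here; the abstract does not say how the penalty is tuned, so any tuning rule would be a further assumption.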
Pages: 437-442
Number of pages: 6