Using non-uniform read distribution models to improve isoform expression inference in RNA-Seq

Cited by: 76
Authors
Wu, Zhengpeng
Wang, Xi
Zhang, Xuegong [1]
Affiliations
[1] Tsinghua Univ, TNLIST Dept Automat, MOE Key Lab Bioinformat, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
MESSENGER-RNA; TRANSCRIPTOME; DISEASE; PARKIN; CHIP;
DOI
10.1093/bioinformatics/btq696
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Motivation: RNA-Seq technology based on next-generation sequencing provides the unprecedented ability of studying transcriptomes at high resolution and accuracy, and the potential of measuring expression of multiple isoforms from the same gene at high precision. Solved by maximum likelihood estimation, isoform expression can be inferred in RNA-Seq using statistical models based on the assumption that sequenced reads are distributed uniformly along transcripts. Modification of the model is needed when considering situations where RNA-Seq data do not follow uniform distribution.
Results: We proposed two curves, the global bias curve (GBC) and the local bias curves (LBCs), to describe the non-uniformity of read distributions for all genes in a transcriptome and for each gene, respectively. Incorporating the bias curves into the uniform read distribution (URD) model, we introduced non-URD (N-URD) models to infer isoform expression levels. On a series of systematic simulation studies, the proposed models outperform the original model in recovering major isoforms and the expression ratio of alternative isoforms. We also applied the new model to real RNA-Seq datasets and found that its inferences on expression ratios of alternative isoforms are more reasonable. The experiments indicate that incorporating N-URD information can improve the accuracy in modeling and inferring isoform expression in RNA-Seq.
Pages: 502-508
Number of pages: 7
Related papers
50 in total
  • [1] PennSeq: accurate isoform-specific gene expression quantification in RNA-Seq by modeling non-uniform read distribution
    Hu, Yu
    Liu, Yichuan
    Mao, Xianyun
    Jia, Cheng
    Ferguson, Jane F.
    Xue, Chenyi
    Reilly, Muredach P.
    Li, Hongzhe
    Li, Mingyao
    NUCLEIC ACIDS RESEARCH, 2014, 42 (03)
  • [2] PDEGEM: Modeling non-uniform read distribution in RNA-Seq data
    Xia, Yuchao
    Wang, Fugui
    Qian, Minping
    Qin, Zhaohui
    Deng, Minghua
    BMC MEDICAL GENOMICS, 2015, 8
  • [3] PDEGEM: Modeling non-uniform read distribution in RNA-Seq data
    Yuchao Xia
    Fugui Wang
    Minping Qian
    Zhaohui Qin
    Minghua Deng
    BMC Medical Genomics, 8
  • [4] Joint estimation of isoform expression and isoform-specific read distribution using multisample RNA-Seq data
    Suo, Chen
    Calza, Stefano
    Salim, Agus
    Pawitan, Yudi
    BIOINFORMATICS, 2014, 30 (04) : 506 - 513
  • [5] NURD: an implementation of a new method to estimate isoform expression from non-uniform RNA-seq data
    Xinyun Ma
    Xuegong Zhang
    BMC Bioinformatics, 14
  • [6] NURD: an implementation of a new method to estimate isoform expression from non-uniform RNA-seq data
    Ma, Xinyun
    Zhang, Xuegong
    BMC BIOINFORMATICS, 2013, 14
  • [7] Statistical inferences for isoform expression in RNA-Seq
    Jiang, Hui
    Wong, Wing Hung
    BIOINFORMATICS, 2009, 25 (08) : 1026 - 1032
  • [8] Estimation of Isoform Expression using Hierarchical Bayesian Model by RNA-seq
    Wang, Zengmiao
    Wang, Jun
    Deng, Minghua
    2015 34TH CHINESE CONTROL CONFERENCE (CCC), 2015, : 8554 - 8558
  • [9] ISOFORM ABUNDANCE INFERENCE PROVIDES A MORE ACCURATE ESTIMATION OF GENE EXPRESSION LEVELS IN RNA-SEQ
    Wang, Xi
    Wu, Zhengpeng
    Zhang, Xuegong
    JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2010, 8 : 177 - 192
  • [10] DELongSeq for efficient detection of differential isoform expression from long-read RNA-seq data
    Hu, Yu
    Gouru, Anagha
    Wang, Kai
    NAR GENOMICS AND BIOINFORMATICS, 2023, 5 (01)