Using non-uniform read distribution models to improve isoform expression inference in RNA-Seq

Cited by: 76
Authors
Wu, Zhengpeng
Wang, Xi
Zhang, Xuegong [1]
Affiliations
[1] Tsinghua Univ, TNLIST Dept Automat, MOE Key Lab Bioinformat, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
MESSENGER-RNA; TRANSCRIPTOME; DISEASE; PARKIN; CHIP;
DOI
10.1093/bioinformatics/btq696
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Motivation: RNA-Seq technology based on next-generation sequencing provides the unprecedented ability of studying transcriptomes at high resolution and accuracy, and the potential of measuring expression of multiple isoforms from the same gene at high precision. Solved by maximum likelihood estimation, isoform expression can be inferred in RNA-Seq using statistical models based on the assumption that sequenced reads are distributed uniformly along transcripts. Modification of the model is needed when considering situations where RNA-Seq data do not follow uniform distribution.
Results: We proposed two curves, the global bias curve (GBC) and the local bias curves (LBCs), to describe the non-uniformity of read distributions for all genes in a transcriptome and for each gene, respectively. Incorporating the bias curves into the uniform read distribution (URD) model, we introduced non-URD (N-URD) models to infer isoform expression levels. On a series of systematic simulation studies, the proposed models outperform the original model in recovering major isoforms and the expression ratio of alternative isoforms. We also applied the new model to real RNA-Seq datasets and found that its inferences on expression ratios of alternative isoforms are more reasonable. The experiments indicate that incorporating N-URD information can improve the accuracy in modeling and inferring isoform expression in RNA-Seq.
Pages: 502-508
Page count: 7
Related papers
50 in total
  • [31] Transcriptome assembly and isoform expression level estimation from biased RNA-Seq reads
    Li, Wei
    Jiang, Tao
    BIOINFORMATICS, 2012, 28 (22) : 2914 - 2921
  • [32] HLA Haplotyping from RNA-seq Data Using Hierarchical Read Weighting
    Kim, Hyunsung John
    Pourmand, Nader
    PLOS ONE, 2013, 8 (06):
  • [33] A model for isoform-level differential expression analysis using RNA-seq data without pre-specifying isoform structure
    Liu, Yang
    Wang, Junying
    Wu, Song
    Yang, Jie
    PLOS ONE, 2022, 17 (05):
  • [34] Mapping medically relevant RNA isoform diversity in the aged human frontal cortex with deep long-read RNA-seq
    Aguzzoli Heberle, Bernardo
    Brandon, J. Anthony
    Page, Madeline L.
    Nations, Kayla A.
    Dikobe, Ketsile I.
    White, Brendan J.
    Gordon, Lacey A.
    Fox, Grant A.
    Wadsworth, Mark E.
    Doyle, Patricia H.
    Williams, Brittney A.
    Fox, Edward J.
    Shantaraman, Anantharaman
    Ryten, Mina
    Goodwin, Sara
    Ghiban, Elena
    Wappel, Robert
    Mavruk-Eskipehlivan, Senem
    Miller, Justin B.
    Seyfried, Nicholas T.
    Nelson, Peter T.
    Fryer, John D.
    Ebbert, Mark T. W.
    NATURE BIOTECHNOLOGY, 2024, : 635 - 646
  • [35] rSeqDiff: Detecting Differential Isoform Expression from RNA-Seq Data Using Hierarchical Likelihood Ratio Test
    Shi, Yang
    Jiang, Hui
    PLOS ONE, 2013, 8 (11):
  • [36] Modeling non-uniformity in short-read rates in RNA-Seq data
    Jun Li
    Hui Jiang
    Wing Hung Wong
    Genome Biology, 11
  • [37] Modeling non-uniformity in short-read rates in RNA-Seq data
    Li, Jun
    Jiang, Hui
    Wong, Wing Hung
    GENOME BIOLOGY, 2010, 11 (05):
  • [38] scphaser: haplotype inference using single-cell RNA-seq data
    Edsgard, Daniel
    Reinius, Bjorn
    Sandberg, Rickard
    BIOINFORMATICS, 2016, 32 (19) : 3038 - 3040
  • [39] TIGAR: transcript isoform abundance estimation method with gapped alignment of RNA-Seq data by variational Bayesian inference
    Nariai, Naoki
    Hirose, Osamu
    Kojima, Kaname
    Nagasaki, Masao
    BIOINFORMATICS, 2013, 29 (18) : 2292 - 2299
  • [40] Quantification of mutant-allele expression at isoform level in cancer from RNA-seq data
    Deng, Wenjiang
    Mou, Tian
    Pawitan, Yudi
    Trung Nghia Vu
    NAR GENOMICS AND BIOINFORMATICS, 2022, 4 (03)