Estimation of alternative splicing isoform frequencies from RNA-Seq data

Cited by: 93
Authors
Nicolae, Marius [1]
Mangul, Serghei [2]
Mandoiu, Ion I. [1]
Zelikovsky, Alex [2]
Affiliations
[1] Univ Connecticut, Dept Comp Sci & Engn, Storrs, CT 06269 USA
[2] Georgia State Univ, Dept Comp Sci, Atlanta, GA 30303 USA
Source
Algorithms for Molecular Biology, 6
Funding
U.S. National Science Foundation
Keywords
SHORT SEQUENCE READS; EXPRESSION LEVELS; GENE-EXPRESSION; TRANSCRIPTOME; QUANTIFICATION; RECONSTRUCTION; REVEALS; GENOME;
DOI
10.1186/1748-7188-6-9
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Massively parallel whole transcriptome sequencing, commonly referred to as RNA-Seq, is quickly becoming the technology of choice for gene expression profiling. However, due to the short read length delivered by current sequencing technologies, estimation of expression levels for alternative splicing gene isoforms remains challenging. Results: In this paper we present a novel expectation-maximization algorithm for inference of isoform- and gene-specific expression levels from RNA-Seq data. Our algorithm, referred to as IsoEM, is based on disambiguating information provided by the distribution of insert sizes generated during sequencing library preparation, and takes advantage of base quality scores, strand and read pairing information when available. The open source Java implementation of IsoEM is freely available at http://dna.engr.uconn.edu/software/IsoEM/. Conclusions: Empirical experiments on both synthetic and real RNA-Seq datasets show that IsoEM has scalable running time and outperforms existing methods of isoform and gene expression level estimation. Simulation experiments confirm previous findings that, for a fixed sequencing cost, using reads longer than 25-36 bases does not necessarily lead to better accuracy for estimating expression levels of annotated isoforms and genes.
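To make the expectation-maximization idea in the abstract concrete, the following is a minimal, generic sketch of EM for a mixture over isoforms. It is not the IsoEM implementation: the function name `em_isoform_frequencies`, the `read_weights` input format, and the stopping rule are all assumptions made for illustration. Each read carries a weight for every isoform it is compatible with; in a fuller model such a weight could reflect insert-size probability or base quality, which are not modeled here. The output is the estimated fraction of reads per isoform, without any correction for isoform length.

```python
def em_isoform_frequencies(read_weights, n_isoforms, n_iter=200, tol=1e-9):
    """Estimate per-isoform read fractions with a plain mixture-model EM.

    read_weights: one dict per read, mapping isoform index -> non-negative
    compatibility weight (hypothetical input format, not IsoEM's).
    """
    # Start from a uniform guess.
    freqs = [1.0 / n_isoforms] * n_isoforms
    for _ in range(n_iter):
        # E-step: split each read fractionally among compatible isoforms.
        counts = [0.0] * n_isoforms
        for weights in read_weights:
            total = sum(freqs[i] * w for i, w in weights.items())
            if total == 0.0:
                continue  # read is incompatible with every isoform; skip it
            for i, w in weights.items():
                counts[i] += freqs[i] * w / total
        assigned = sum(counts)
        if assigned == 0.0:
            break  # no usable reads; keep the current estimate
        # M-step: renormalize expected counts into frequencies.
        new_freqs = [c / assigned for c in counts]
        delta = max(abs(a - b) for a, b in zip(new_freqs, freqs))
        freqs = new_freqs
        if delta < tol:
            break
    return freqs


# Toy input: 3 reads unique to isoform 0, 1 unique to isoform 1,
# and 4 reads equally compatible with both.
reads = [{0: 1.0}] * 3 + [{1: 1.0}] + [{0: 1.0, 1: 1.0}] * 4
print(em_isoform_frequencies(reads, n_isoforms=2))
```

On this toy input the estimates converge toward roughly 0.75 and 0.25: the ambiguous reads are apportioned according to the evidence from the uniquely mapped reads.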
Pages: 13
Related papers
50 in total
  • [1] Estimation of alternative splicing isoform frequencies from RNA-Seq data
    Marius Nicolae
    Serghei Mangul
    Ion I Măndoiu
    Alex Zelikovsky
    Algorithms for Molecular Biology, 6
  • [2] Estimation of Alternative Splicing Isoform Frequencies from RNA-Seq Data
    Nicolae, Marius
    Mangul, Serghei
    Mandoiu, Ion
    Zelikovsky, Alex
    ALGORITHMS IN BIOINFORMATICS, 2010, 6293 : 202 - +
  • [3] Modeling Alternative Splicing Variants from RNA-Seq Data with Isoform Graphs
    Beretta, Stefano
    Bonizzoni, Paola
    Della Vedova, Gianluca
    Pirola, Yuri
    Rizzi, Raffaella
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2014, 21 (01) : 16 - 40
  • [4] Intron-centric estimation of alternative splicing from RNA-seq data
    Pervouchine, Dmitri D.
    Knowles, David G.
    Guigo, Roderic
    BIOINFORMATICS, 2013, 29 (02) : 273 - 274
  • [5] Identification of Alternative Splicing and Polyadenylation in RNA-seq Data
    Dixit, Gunjan
    Zheng, Ying
    Parker, Brian
    Wen, Jiayu
    JOVE-JOURNAL OF VISUALIZED EXPERIMENTS, 2021, (172):
  • [6] Statistical modeling of isoform splicing dynamics from RNA-seq time series data
    Huang, Yuanhua
    Sanguinetti, Guido
    BIOINFORMATICS, 2016, 32 (19) : 2965 - 2972
  • [7] Inference of alternative splicing from RNA-Seq data with probabilistic splice graphs
    LeGault, Laura H.
    Dewey, Colin N.
    BIOINFORMATICS, 2013, 29 (18) : 2300 - 2310
  • [8] Detection, annotation and visualization of alternative splicing from RNA-Seq data with SplicingViewer
    Liu, Qi
    Chen, Chong
    Shen, Enjian
    Zhao, Fangqing
    Sun, Zhongsheng
    Wu, Jinyu
    GENOMICS, 2012, 99 (03) : 178 - 182
  • [9] iReckon: Simultaneous isoform discovery and abundance estimation from RNA-seq data
    Mezlini, Aziz M.
    Smith, Eric J. M.
    Fiume, Marc
    Buske, Orion
    Savich, Gleb L.
    Shah, Sohrab
    Aparicio, Sam
    Chiang, Derek Y.
    Goldenberg, Anna
    Brudno, Michael
    GENOME RESEARCH, 2013, 23 (03) : 519 - 529
  • [10] Alternative splicing, RNA-seq and drug discovery
    Zhao, Shanrong
    DRUG DISCOVERY TODAY, 2019, 24 (06) : 1258 - 1267