Estimation of alternative splicing isoform frequencies from RNA-Seq data

Cited by: 93
Authors
Nicolae, Marius [1]
Mangul, Serghei [2]
Mandoiu, Ion I. [1]
Zelikovsky, Alex [2]
Affiliations
[1] Univ Connecticut, Dept Comp Sci & Engn, Storrs, CT 06269 USA
[2] Georgia State Univ, Dept Comp Sci, Atlanta, GA 30303 USA
Source
Funding
U.S. National Science Foundation;
Keywords
SHORT SEQUENCE READS; EXPRESSION LEVELS; GENE-EXPRESSION; TRANSCRIPTOME; QUANTIFICATION; RECONSTRUCTION; REVEALS; GENOME;
DOI
10.1186/1748-7188-6-9
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: Massively parallel whole transcriptome sequencing, commonly referred to as RNA-Seq, is quickly becoming the technology of choice for gene expression profiling. However, due to the short read length delivered by current sequencing technologies, estimation of expression levels for alternative splicing gene isoforms remains challenging. Results: In this paper we present a novel expectation-maximization algorithm for inference of isoform- and gene-specific expression levels from RNA-Seq data. Our algorithm, referred to as IsoEM, is based on disambiguating information provided by the distribution of insert sizes generated during sequencing library preparation, and takes advantage of base quality scores, strand and read pairing information when available. The open source Java implementation of IsoEM is freely available at http://dna.engr.uconn.edu/software/IsoEM/. Conclusions: Empirical experiments on both synthetic and real RNA-Seq datasets show that IsoEM has scalable running time and outperforms existing methods of isoform and gene expression level estimation. Simulation experiments confirm previous findings that, for a fixed sequencing cost, using reads longer than 25-36 bases does not necessarily lead to better accuracy for estimating expression levels of annotated isoforms and genes.
Pages: 13
Related papers
50 in total
  • [31] Opportunities and methods for studying alternative splicing in cancer with RNA-Seq
    Feng, Huijuan
    Qin, Zhiyi
    Zhang, Xuegong
    CANCER LETTERS, 2013, 340 (02) : 179 - 191
  • [32] PathwaySplice: an R package for unbiased pathway analysis of alternative splicing in RNA-Seq data
    Yan, Aimin
    Ban, Yuguang
    Gao, Zhen
    Chen, Xi
    Wang, Lily
    BIOINFORMATICS, 2018, 34 (18) : 3220 - 3222
  • [33] A Statistical Method for the Detection of Alternative Splicing Using RNA-Seq
    Wang, Liguo
    Xi, Yuanxin
    Yu, Jun
    Dong, Liping
    Yen, Laising
    Li, Wei
    PLOS ONE, 2010, 5 (01):
  • [34] Quantification of co-transcriptional splicing from RNA-Seq data
    Herzel, Lydia
    Neugebauer, Karla M.
    METHODS, 2015, 85 : 36 - 43
  • [35] ARH-seq: identification of differential splicing in RNA-seq data
    Rasche, Axel
    Lienhard, Matthias
    Yaspo, Marie-Laure
    Lehrach, Hans
    Herwig, Ralf
    NUCLEIC ACIDS RESEARCH, 2014, 42 (14) : e110
  • [36] Computational analysis of alternative polyadenylation from standard RNA-seq and single-cell RNA-seq data
    Gao, Yipeng
    Li, Wei
    MRNA 3' END PROCESSING AND METABOLISM, 2021, 655 : 225 - 243
  • [37] Efficient RNA isoform identification and quantification from RNA-Seq data with network flows
    Bernard, Elsa
    Jacob, Laurent
    Mairal, Julien
    Vert, Jean-Philippe
    BIOINFORMATICS, 2014, 30 (17) : 2447 - 2455
  • [38] A novel robust statistical method for isoform quantification from RNA-seq data
    Mondal, Pronoy K.
    Chatterjee, Raghunath
    Mukhopadhyay, Indranil
    GENETIC EPIDEMIOLOGY, 2018, 42 (07) : 719 - 719
  • [39] Joint estimation of isoform expression and isoform-specific read distribution using multisample RNA-Seq data
    Suo, Chen
    Calza, Stefano
    Salim, Agus
    Pawitan, Yudi
    BIOINFORMATICS, 2014, 30 (04) : 506 - 513
  • [40] SAW: A Method to Identify Splicing Events from RNA-Seq Data Based on Splicing Fingerprints
    Ning, Kang
    Fermin, Damian
    PLOS ONE, 2010, 5 (08):