TIGAR2: sensitive and accurate estimation of transcript isoform expression with longer RNA-Seq reads

Cited by: 28
Authors
Nariai, Naoki [1]
Kojima, Kaname [1]
Mimori, Takahiro [1]
Sato, Yukuto [1]
Kawai, Yosuke [1]
Yamaguchi-Kabata, Yumi [1]
Nagasaki, Masao [1]
Affiliations
[1] Tohoku Univ, Tohoku Med Megabank Org, Dept Integrat Genom, Aoba Ku, Sendai, Miyagi 9808573, Japan
Source
BMC GENOMICS | 2014 / Vol. 15
Keywords
REFERENCE GENOME; ALIGNMENT; GENE; QUANTIFICATION; REVEALS;
DOI
10.1186/1471-2164-15-S10-S5
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: High-throughput RNA sequencing (RNA-Seq) enables quantification and identification of transcripts at single-base resolution. Recently, longer sequence reads have become available thanks to the development of new types of sequencing technologies as well as improvements in chemical reagents for next-generation sequencers. Although several computational methods have been proposed for quantifying gene expression levels from RNA-Seq data, they are not sufficiently optimized for longer reads (e.g. > 250 bp). Results: We propose TIGAR2, a statistical method for quantifying transcript isoforms from fixed- and variable-length RNA-Seq data. Our method models substitution, deletion, and insertion errors of sequencers based on gapped alignments of reads to the reference cDNA sequences, so that sensitive read aligners such as Bowtie2 and BWA-MEM are effectively incorporated in our pipeline. In addition, a heuristic algorithm is implemented in the variational Bayesian inference for faster computation. We apply TIGAR2 to both simulated data and real data from human samples and evaluate the performance of transcript quantification with TIGAR2 in comparison to existing methods. Conclusions: TIGAR2 is a sensitive and accurate tool for quantifying transcript isoform abundances from RNA-Seq data. Our method performs better than existing methods for fixed-length reads (100 bp, 250 bp, 500 bp, and 1000 bp, both single-end and paired-end) and for variable-length reads, especially for reads longer than 250 bp.
Pages: 9
Related papers
50 in total
  • [31] TRIP: a method for novel transcript reconstruction from paired-end RNA-seq reads
    Serghei Mangul
    Adrian Caciula
    Dumitru Brinza
    Ion I Mandoiu
    Alex Zelikovsky
    BMC Bioinformatics, 13 (Suppl 18)
  • [32] Bayesian estimation of differential transcript usage from RNA-seq data
    Papastamoulis, Panagiotis
    Rattray, Magnus
    STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY, 2017, 16 (5-6) : 387 - 405
  • [33] MSIQ: JOINT MODELING OF MULTIPLE RNA-SEQ SAMPLES FOR ACCURATE ISOFORM QUANTIFICATION
    Li, Wei Vivian
    Zhao, Anqi
    Zhang, Shihua
    Li, Jingyi Jessica
    ANNALS OF APPLIED STATISTICS, 2018, 12 (01) : 510 - 539
  • [34] Estimation of alternative splicing isoform frequencies from RNA-Seq data
    Marius Nicolae
    Serghei Mangul
    Ion I Măndoiu
    Alex Zelikovsky
    Algorithms for Molecular Biology, 6
  • [35] Estimation of Alternative Splicing isoform Frequencies from RNA-Seq Data
    Nicolae, Marius
    Mangul, Serghei
    Mandoiu, Ion
    Zelikovsky, Alex
    ALGORITHMS IN BIOINFORMATICS, 2010, 6293 : 202 - +
  • [36] Estimation of alternative splicing isoform frequencies from RNA-Seq data
    Nicolae, Marius
    Mangul, Serghei
    Mandoiu, Ion I.
    Zelikovsky, Alex
    ALGORITHMS FOR MOLECULAR BIOLOGY, 2011, 6
  • [37] Characterization and improvement of RNA-Seq precision in quantitative transcript expression profiling
    Labaj, Pawel P.
    Leparc, German G.
    Linggi, Bryan E.
    Markillie, Lye Meng
    Wiley, H. Steven
    Kreil, David P.
    BIOINFORMATICS, 2011, 27 (13) : I383 - I391
  • [38] Identification and visualization of differential isoform expression in RNA-seq time series
    Nueda, Maria Jose
    Martorell-Marugan, Jordi
    Marti, Cristina
    Tarazona, Sonia
    Conesa, Ana
    BIOINFORMATICS, 2018, 34 (03) : 524 - 526
  • [39] Comparative assessment of methods for the computational inference of transcript isoform abundance from RNA-seq data
    Alexander Kanitz
    Foivos Gypas
    Andreas J. Gruber
    Andreas R. Gruber
    Georges Martin
    Mihaela Zavolan
    Genome Biology, 16
  • [40] Comparative assessment of methods for the computational inference of transcript isoform abundance from RNA-seq data
    Kanitz, Alexander
    Gypas, Foivos
    Gruber, Andreas J.
    Gruber, Andreas R.
    Martin, Georges
    Zavolan, Mihaela
    GENOME BIOLOGY, 2015, 16