TIGAR2: sensitive and accurate estimation of transcript isoform expression with longer RNA-Seq reads

Cited by: 28
Authors
Nariai, Naoki [1]
Kojima, Kaname [1]
Mimori, Takahiro [1]
Sato, Yukuto [1]
Kawai, Yosuke [1]
Yamaguchi-Kabata, Yumi [1]
Nagasaki, Masao [1]
Affiliation
[1] Tohoku Univ, Tohoku Med Megabank Org, Dept Integrat Genom, Aoba Ku, Sendai, Miyagi 9808573, Japan
Source
BMC GENOMICS | 2014 / Volume 15
Keywords
REFERENCE GENOME; ALIGNMENT; GENE; QUANTIFICATION; REVEALS;
DOI
10.1186/1471-2164-15-S10-S5
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: High-throughput RNA sequencing (RNA-Seq) enables quantification and identification of transcripts at single-base resolution. Recently, longer sequence reads have become available thanks to the development of new types of sequencing technologies as well as improvements in chemical reagents for next-generation sequencers. Although several computational methods have been proposed for quantifying gene expression levels from RNA-Seq data, they are not sufficiently optimized for longer reads (e.g. > 250 bp).
Results: We propose TIGAR2, a statistical method for quantifying transcript isoforms from fixed- and variable-length RNA-Seq data. Our method models the substitution, deletion, and insertion errors of sequencers based on gapped alignments of reads to the reference cDNA sequences, so that sensitive read aligners such as Bowtie2 and BWA-MEM are effectively incorporated into our pipeline. In addition, a heuristic algorithm is implemented in the variational Bayesian inference for faster computation. We apply TIGAR2 to both simulated data and real data from human samples, and evaluate the performance of transcript quantification with TIGAR2 in comparison to existing methods.
Conclusions: TIGAR2 is a sensitive and accurate tool for quantifying transcript isoform abundances from RNA-Seq data. Our method performs better than existing methods for fixed-length reads (100 bp, 250 bp, 500 bp, and 1000 bp, both single-end and paired-end) and for variable-length reads, especially for reads longer than 250 bp.
Pages: 9
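
Note: the abstract says that substitution, deletion, and insertion errors are modeled from gapped alignments of reads. As a rough illustration of what a gapped alignment makes available, the sketch below tallies those three event types from SAM CIGAR strings. It is not TIGAR2's implementation: the function names `tally_cigar` and `error_rates` are hypothetical, the input is assumed to use the extended CIGAR operators `=` (match) and `X` (mismatch), and the rates are simple per-aligned-base frequencies rather than the paper's statistical model.

```python
import re
from collections import Counter

# One CIGAR element: a length followed by an operator from the SAM specification.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def tally_cigar(cigar: str) -> Counter:
    """Sum the number of bases covered by each CIGAR operator."""
    counts = Counter()
    consumed = 0
    for match in CIGAR_OP.finditer(cigar):
        counts[match.group(2)] += int(match.group(1))
        consumed += len(match.group(0))
    # Reject strings with characters that no CIGAR element accounted for.
    if consumed != len(cigar) or not cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return counts


def error_rates(cigars) -> dict:
    """Per-aligned-base substitution, insertion, and deletion frequencies.

    Assumption: 'aligned bases' means '=' plus 'X' columns, so plain 'M'
    operators (which hide mismatches) are deliberately not supported here.
    """
    total = Counter()
    for cigar in cigars:
        total += tally_cigar(cigar)
    if total["M"]:
        raise ValueError("expected extended CIGAR (=/X), found 'M'")
    aligned = total["="] + total["X"]
    if aligned == 0:
        raise ValueError("no aligned bases")
    return {
        "substitution": total["X"] / aligned,
        "insertion": total["I"] / aligned,
        "deletion": total["D"] / aligned,
    }


if __name__ == "__main__":
    # Two made-up alignments, used only to exercise the functions.
    print(error_rates(["8=1X3=2I5=", "10=1D9=1X"]))
```

Aligners that emit plain `M` operators would need the mismatch positions recovered from another source (for example the optional `MD` tag) before a tally like this could separate substitutions from matches.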