Evaluation of tools for long read RNA-seq splice-aware alignment

Cited by: 44
Authors
Krizanovic, Kresimir [1]
Echchiki, Amina [2, 3]
Roux, Julien [2, 3, 5]
Sikic, Mile [1, 4]
Affiliations
[1] Univ Zagreb, Fac Elect Engn & Comp, Dept Elect Syst & Informat Proc, Zagreb 10000, Croatia
[2] Univ Lausanne, Dept Ecol & Evolut, CH-1015 Lausanne, Switzerland
[3] Swiss Inst Bioinformat, CH-1015 Lausanne, Switzerland
[4] Bioinformat Inst, Singapore 138671, Singapore
[5] Univ Hosp Basel, Dept Biomed, CH-4031 Basel, Switzerland
Keywords
TRANSCRIPTOME; ALIGNER; HYBRID;
DOI
10.1093/bioinformatics/btx668
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
High-throughput sequencing has transformed the study of gene expression levels through RNA-seq, a technique that is now routinely used in various fields, such as genetic research and diagnostics. The advent of third-generation sequencing technologies, which provide significantly longer reads, opens up new possibilities. However, the high error rates common to these technologies pose new bioinformatics challenges for the gapped alignment of reads to their genomic origin. In this study, we explored how currently available RNA-seq splice-aware alignment tools cope with increased read lengths and error rates. All tested tools were initially developed for short NGS reads, but some claim support for long Pacific Biosciences (PacBio) or even Oxford Nanopore Technologies (ONT) MinION reads. The tools were tested on synthetic and real datasets from two technologies (PacBio and ONT MinION). Alignment quality and resource usage were compared across the aligners. The effect of error correction of long reads was explored, using both self-correction and correction with an external short-read dataset. A tool was developed for evaluating RNA-seq alignment results. This tool can be used to compare the alignment of simulated reads to their genomic origin, or to compare the alignment of real reads to a set of annotated transcripts. Our tests show that while some RNA-seq aligners were unable to cope with long error-prone reads, others produced good results overall. We further show that alignment accuracy can be improved using error-corrected reads.
Availability: https://figshare.com/projects/RNAseq_benchmark/24391
Pages: 748-754
Page count: 7