Evaluation of tools for long read RNA-seq splice-aware alignment

Cited by: 44
Authors
Krizanovic, Kresimir [1]
Echchiki, Amina [2, 3]
Roux, Julien [2, 3, 5]
Sikic, Mile [1, 4]
Affiliations
[1] Univ Zagreb, Fac Elect Engn & Comp, Dept Elect Syst & Informat Proc, Zagreb 10000, Croatia
[2] Univ Lausanne, Dept Ecol & Evolut, CH-1015 Lausanne, Switzerland
[3] Swiss Inst Bioinformat, CH-1015 Lausanne, Switzerland
[4] Bioinformat Inst, Singapore 138671, Singapore
[5] Univ Hosp Basel, Dept Biomed, CH-4031 Basel, Switzerland
Keywords
TRANSCRIPTOME; ALIGNER; HYBRID
DOI
10.1093/bioinformatics/btx668
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
High-throughput sequencing has transformed the study of gene expression levels through RNA-seq, a technique that is now routinely used by various fields, such as genetic research or diagnostics. The advent of third generation sequencing technologies providing significantly longer reads opens up new possibilities. However, the high error rates common to these technologies set new bioinformatics challenges for the gapped alignment of reads to their genomic origin. In this study, we have explored how currently available RNA-seq splice-aware alignment tools cope with increased read lengths and error rates. All tested tools were initially developed for short NGS reads, but some have claimed support for long Pacific Biosciences (PacBio) or even Oxford Nanopore Technologies (ONT) MinION reads. The tools were tested on synthetic and real datasets from two technologies (PacBio and ONT MinION). Alignment quality and resource usage were compared across different aligners. The effect of error correction of long reads was explored, both using self-correction and correction with an external short reads dataset. A tool was developed for evaluating RNA-seq alignment results. This tool can be used to compare the alignment of simulated reads to their genomic origin, or to compare the alignment of real reads to a set of annotated transcripts. Our tests show that while some RNA-seq aligners were unable to cope with long error-prone reads, others produced overall good results. We further show that alignment accuracy can be improved using error-corrected reads.
Figshare project: https://figshare.com/projects/RNAseq_benchmark/24391
Pages: 748-754
Page count: 7