Evaluation of tools for long read RNA-seq splice-aware alignment

Cited by: 44
Authors
Krizanovic, Kresimir [1]
Echchiki, Amina [2, 3]
Roux, Julien [2, 3, 5]
Sikic, Mile [1, 4]
Affiliations
[1] Univ Zagreb, Fac Elect Engn & Comp, Dept Elect Syst & Informat Proc, Zagreb 10000, Croatia
[2] Univ Lausanne, Dept Ecol & Evolut, CH-1015 Lausanne, Switzerland
[3] Swiss Inst Bioinformat, CH-1015 Lausanne, Switzerland
[4] Bioinformat Inst, Singapore 138671, Singapore
[5] Univ Hosp Basel, Dept Biomed, CH-4031 Basel, Switzerland
Keywords
TRANSCRIPTOME; ALIGNER; HYBRID;
DOI
10.1093/bioinformatics/btx668
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
High-throughput sequencing has transformed the study of gene expression levels through RNA-seq, a technique that is now routinely used in various fields, such as genetic research or diagnostics. The advent of third-generation sequencing technologies providing significantly longer reads opens up new possibilities. However, the high error rates common to these technologies set new bioinformatics challenges for the gapped alignment of reads to their genomic origin. In this study, we have explored how currently available RNA-seq splice-aware alignment tools cope with increased read lengths and error rates. All tested tools were initially developed for short NGS reads, but some have claimed support for long Pacific Biosciences (PacBio) or even Oxford Nanopore Technologies (ONT) MinION reads. The tools were tested on synthetic and real datasets from two technologies (PacBio and ONT MinION). Alignment quality and resource usage were compared across different aligners. The effect of error correction of long reads was explored, both using self-correction and correction with an external short-read dataset. A tool was developed for evaluating RNA-seq alignment results. This tool can be used to compare the alignment of simulated reads to their genomic origin, or to compare the alignment of real reads to a set of annotated transcripts. Our tests show that while some RNA-seq aligners were unable to cope with long error-prone reads, others produced overall good results. We further show that alignment accuracy can be improved using error-corrected reads.
https://figshare.com/projects/RNAseq_benchmark/24391
Pages: 748-754
Number of pages: 7