Contrasting and combining transcriptome complexity captured by short and long RNA sequencing reads

Times cited: 2
Authors
Han, Seong Woo [1]
Jewell, San [2]
Thomas-Tikhonenko, Andrei [3, 4]
Barash, Yoseph [1, 2]
Affiliations
[1] Univ Penn, Sch Engn, Dept Comp & Informat Sci, Philadelphia, PA 19104 USA
[2] Univ Penn, Perelman Sch Med, Dept Genet, Philadelphia, PA 19104 USA
[3] Univ Penn, Perelman Sch Med, Dept Pathol & Lab Med, Philadelphia, PA 19104 USA
[4] Childrens Hosp Philadelphia, Div Canc Pathobiol, Philadelphia, PA 19104 USA
Funding
National Institutes of Health (USA)
DOI
10.1101/gr.278659.123
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Mapping transcriptomic variations using either short- or long-read RNA sequencing is a staple of genomic research. Long reads are able to capture entire isoforms and overcome repetitive regions, whereas short reads still provide improved coverage and error rates. Yet, open questions remain, such as how to quantitatively compare the technologies, can we combine them, and what is the benefit of such a combined view? We tackle these questions by first creating a pipeline to assess matched long- and short-read data using a variety of transcriptome statistics. We find that across data sets, algorithms, and technologies, matched short-read data detects ~30% more splice junctions, such that ~10%-30% of the splice junctions included at ≥20% by short reads are missed by long reads. In contrast, long reads detect many more intron-retention events and can detect full isoforms, pointing to the benefit of combining the technologies. We introduce MAJIQ-L, an extension of the MAJIQ software, to enable a unified view of transcriptome variations from both technologies and demonstrate its benefits. Our software can be used to assess any future long-read technology or algorithm and can be combined with short-read data for improved transcriptome analysis.
Pages: 1624-1635
Page count: 12