Contrasting and combining transcriptome complexity captured by short and long RNA sequencing reads

Cited by: 2
Authors
Han, Seong Woo [1]
Jewell, San [2]
Thomas-Tikhonenko, Andrei [3, 4]
Barash, Yoseph [1, 2]
Affiliations
[1] Univ Penn, Sch Engn, Dept Comp & Informat Sci, Philadelphia, PA 19104 USA
[2] Univ Penn, Perelman Sch Med, Dept Genet, Philadelphia, PA 19104 USA
[3] Univ Penn, Perelman Sch Med, Dept Pathol & Lab Med, Philadelphia, PA 19104 USA
[4] Childrens Hosp Philadelphia, Div Canc Pathobiol, Philadelphia, PA 19104 USA
Funding
National Institutes of Health (USA)
DOI
10.1101/gr.278659.123
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Mapping transcriptomic variations using either short- or long-read RNA sequencing is a staple of genomic research. Long reads are able to capture entire isoforms and overcome repetitive regions, whereas short reads still provide improved coverage and error rates. Yet, open questions remain, such as how to quantitatively compare the technologies, can we combine them, and what is the benefit of such a combined view? We tackle these questions by first creating a pipeline to assess matched long- and short-read data using a variety of transcriptome statistics. We find that across data sets, algorithms, and technologies, matched short-read data detects ~30% more splice junctions, such that ~10%-30% of the splice junctions included at ≥20% by short reads are missed by long reads. In contrast, long reads detect many more intron-retention events and can detect full isoforms, pointing to the benefit of combining the technologies. We introduce MAJIQ-L, an extension of the MAJIQ software, to enable a unified view of transcriptome variations from both technologies and demonstrate its benefits. Our software can be used to assess any future long-read technology or algorithm and can be combined with short-read data for improved transcriptome analysis.
Pages: 1624-1635
Page count: 12