Overlap-Based Genome Assembly from Variable-Length Reads

Cited by: 0
Authors
Hui, Joseph [1]
Shomorony, Ilan [1]
Ramchandran, Kannan [1]
Courtade, Thomas A. [1]
Affiliation
[1] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
Keywords
DOI
Not available
Chinese Library Classification
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
Recently developed high-throughput sequencing platforms can generate very long reads, making the perfect assembly of whole genomes information-theoretically possible [1]. One of the challenges in achieving this goal in practice, however, is that traditional assembly algorithms based on the de Bruijn graph framework cannot handle the high error rates of long-read technologies. On the other hand, overlap-based approaches such as string graphs [2] are very robust to errors, but cannot achieve the theoretical lower bounds. In particular, these methods handle the variable-length reads provided by long-read technologies in a suboptimal manner. In this work, we introduce a new assembly algorithm with two desirable features in the context of long-read sequencing: (1) it is an overlap-based method, thus being more resilient to read errors than de Bruijn graph approaches; and (2) it achieves the information-theoretic bounds even in the variable-length read setting.
Pages: 1018-1022
Number of pages: 5