Complex genome assembly based on long-read sequencing

Citations: 8
Authors
Zhang, Tianjiao [1]
Zhou, Jie [1]
Gao, Wentao [1]
Jia, Yuran [1]
Wei, Yanan [1]
Wang, Guohua [1]
Affiliations
[1] Northeast Forestry University, College of Information and Computer Engineering, Harbin, People's Republic of China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
genome assembly; haplotype; long-read sequencing; DE-BRUIJN GRAPHS; ACCURATE;
DOI
10.1093/bib/bbac305
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
High-quality chromosome-scale genome sequences provide an important basis for downstream genomic analysis. In particular, the construction of haplotype-resolved, complete genomes plays a key role in genome annotation, mutation detection, evolutionary analysis, gene function research, comparative genomics and other areas. However, genome-wide short-read sequencing struggles to produce a complete assembly of a complex genome with high duplication and multiple heterozygosity. The emergence of long-read sequencing technology has greatly improved the completeness of complex genome assemblies. We review a variety of computational methods for complex genome assembly and describe in detail the theories, innovations and shortcomings of collapsed, semi-collapsed and uncollapsed assemblers based on long reads. Among the three approaches, uncollapsed assembly is the most correct and complete way to represent genomes. In addition, genome assembly is closely related to haplotype reconstruction: uncollapsed assembly realizes haplotype reconstruction, and haplotype reconstruction in turn promotes uncollapsed assembly. We hope that gapless, telomere-to-telomere and accurate assembly of complex genomes can in the future be achieved routinely using only a simple process or a single tool.
Pages: 11