Complex genome assembly based on long-read sequencing

Cited by: 7
Authors
Zhang, Tianjiao [1]
Zhou, Jie [1]
Gao, Wentao [1]
Jia, Yuran [1]
Wei, Yanan [1]
Wang, Guohua [1]
Affiliations
[1] Northeast Forestry University, College of Information and Computer Engineering, Harbin, People's Republic of China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
genome assembly; haplotype; long-read sequencing; de Bruijn graphs; accurate
DOI
10.1093/bib/bbac305
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
High-quality chromosome-scale genome sequences provide an important basis for downstream genomic analysis. In particular, the construction of haplotype-resolved and complete genomes plays a key role in genome annotation, mutation detection, evolutionary analysis, gene function research, comparative genomics and other areas. However, genome-wide short-read sequencing struggles to produce a complete genome when the genome is complex, with extensive duplication and high heterozygosity. The emergence of long-read sequencing technology has greatly improved the completeness of complex genome assemblies. We review a variety of computational methods for complex genome assembly and describe in detail the theories, innovations and shortcomings of collapsed, semi-collapsed and uncollapsed assemblers based on long reads. Of the three approaches, uncollapsed assembly represents genomes most correctly and completely. In addition, genome assembly is closely related to haplotype reconstruction: uncollapsed assembly achieves haplotype reconstruction, and haplotype reconstruction in turn facilitates uncollapsed assembly. We hope that gapless, telomere-to-telomere and accurate assembly of complex genomes can become truly routine in the future, using only a simple process or a single tool.
Pages: 11