AsmMix: an efficient haplotype-resolved hybrid de novo genome assembling pipeline

Authors
Liu, Chao [1, 2]
Wu, Pei [1, 2]
Wu, Xue [2]
Zhao, Xia [3]
Chen, Fang [3]
Cheng, Xiaofang [3]
Zhu, Hongmei [1, 2]
Wang, Ou [2]
Xu, Mengyang [2, 4]
Affiliations
[1] BGI, Tianjin, People's Republic of China
[2] BGI Research, Shenzhen, People's Republic of China
[3] MGI Tech, Shenzhen, People's Republic of China
[4] BGI Research, Qingdao, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
long reads; bioinformatics; de novo; genome assembly; haplotype; hybrid; LONG; ACCURATE; READS
DOI
10.3389/fgene.2024.1421565
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Accurate haplotyping facilitates distinguishing allele-specific expression, identifying cis-regulatory elements, and characterizing genomic variations, which enables more precise investigations into the relationship between genotype and phenotype. Recent advances in third-generation single-molecule long-read and synthetic co-barcoded read sequencing techniques have harnessed long-range information to simplify the assembly graph and improve the assembled genomic sequence. However, reconstructing complete haplotypes remains methodologically challenging because of the high sequencing error rates of long reads and the limited capture efficiency of co-barcoded reads. Here we present a pipeline, AsmMix, for generating diploid genomes that are both contiguous and accurate. It first assembles co-barcoded reads into accurate haplotype-resolved assemblies that may contain many gaps, whereas the long-read assembly is contiguous but susceptible to errors. The two assembly sets are then integrated into haplotype-resolved assemblies with fewer misassemblies. In extensive evaluation on multiple synthetic datasets, AsmMix consistently demonstrates high precision and recall for haplotyping across diverse sequencing platforms, coverage depths, read lengths, and read accuracies, significantly outperforming other existing tools. Furthermore, we validate the pipeline on a human whole-genome dataset (HG002) and produce highly contiguous, accurate, and haplotype-resolved assemblies. These assemblies are evaluated against the GIAB benchmarks, confirming the accuracy of variant calling. Our results demonstrate that AsmMix offers a straightforward yet highly efficient approach that effectively leverages both long reads and co-barcoded reads for haplotype-resolved assembly.
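
The abstract reports precision and recall for haplotyping but does not say how they are computed. As a rough illustration only, the sketch below assumes one simple definition: precision is the fraction of called variants placed on the correct haplotype, and recall is the fraction of truth-set variants recovered with the correct haplotype. The function name, the dictionary layout (variant position mapped to haplotype label 0 or 1), and the toy data are all hypothetical and are not taken from AsmMix or the paper.

```python
from typing import Dict, Tuple


def haplotype_precision_recall(
    truth: Dict[int, int], called: Dict[int, int]
) -> Tuple[float, float]:
    """Precision and recall of haplotype assignments (assumed definition).

    truth and called map a variant position to a haplotype label (0 or 1).
    """
    # Variants that appear in both the call set and the truth set.
    shared = [pos for pos in called if pos in truth]
    same = sum(1 for pos in shared if called[pos] == truth[pos])
    # Haplotype labels are arbitrary, so a globally swapped labeling is
    # equally valid; count whichever orientation agrees more often.
    correct = max(same, len(shared) - same)
    precision = correct / len(called) if called else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


# Toy data (hypothetical): three shared variants with swapped labels,
# one variant called but absent from truth, one truth variant missed.
truth = {100: 0, 250: 1, 400: 0, 900: 1}
called = {100: 1, 250: 0, 400: 1, 700: 0}
precision, recall = haplotype_precision_recall(truth, called)
print(precision, recall)
```

Real evaluations are usually done per phase block rather than with one genome-wide label swap, so this sketch is a simplification.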
Pages: 17