De novo diploid genome assembly using long noisy reads

Authors
Fan Nie
Peng Ni
Neng Huang
Jun Zhang
Zhenyu Wang
Chuanle Xiao
Feng Luo
Jianxin Wang
Affiliations
[1] Central South University, School of Computer Science and Engineering
[2] Xiangjiang Laboratory
[3] Xiangtan University, National Center for Applied Mathematics in Hunan and Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education
[4] Central South University, Hunan Provincial Key Lab on Bioinformatics
[5] Guangdong Academy of Sciences, Institute of Nanfan & Seed Industry
[6] Sun Yat-sen University, State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, #7 Jinsui Road, Tianhe District
[7] Clemson University, School of Computing
Abstract
The high sequencing error rate of long noisy reads has impeded their use in diploid genome assembly, and most existing assemblers fail to generate high-quality phased assemblies from such reads. Here, we present PECAT, a Phased Error Correction and Assembly Tool, for reconstructing diploid genomes from long noisy reads. We design a haplotype-aware error correction method that retains heterozygous alleles while correcting sequencing errors. We combine a corrected-read SNP caller and a raw-read SNP caller to further improve the identification of inconsistent overlaps in the string graph. We use a grouping method to assign reads to different haplotype groups. PECAT efficiently assembles diploid genomes using only Nanopore R9, PacBio CLR or Nanopore R10 reads, and it generates more contiguous haplotype-specific contigs than other assemblers. In particular, PECAT achieves a nearly haplotype-resolved assembly of B. taurus (Bison×Simmental) using Nanopore R9 reads, and a phase block NG50 of 59.4/58.0 Mb for HG002 using Nanopore R10 reads.
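The abstract reports contiguity as phase block NG50. As a reading aid, here is a minimal sketch of how NG50 is conventionally computed. It is not taken from PECAT: the function name `ng50` and the example lengths are hypothetical, and the genome size is assumed to be an externally supplied estimate. For phase block NG50, the same calculation is applied to phase block lengths rather than contig lengths.

```python
from typing import Iterable, Optional


def ng50(lengths: Iterable[int], genome_size: int) -> Optional[int]:
    """Return the NG50 of a set of sequence (or phase block) lengths.

    NG50 is the length L such that sequences of length >= L together
    cover at least half of the (estimated) genome size. Returns None
    when the sequences cover less than half of the genome.
    """
    half = genome_size / 2
    covered = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        covered += length
        if covered >= half:
            return length
    return None


# Hypothetical toy example: five blocks against an assumed genome size of 200.
example_lengths = [50, 40, 30, 20, 10]
example_ng50 = ng50(example_lengths, genome_size=200)
```

Unlike N50, which uses the total assembly length as the denominator, NG50 uses the genome size, so values are comparable across assemblies of the same genome.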