Optimizing k-mer size using a variant grid search to enhance de novo genome assembly

Authors
Cha, Soyeon
Bird, David McK [1]
Affiliations
[1] NC State Univ, Bioinformat Res Ctr, Raleigh, NC 27695 USA
Keywords
ABySS; CEGMA; contigs; KmerGenie; N50; next-generation sequencing; SOAPdenovo; Velvet
DOI
Not available
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Largely driven by huge reductions in per-base costs, sequencing nucleic acids has become a near-ubiquitous technique in laboratories performing biological and biomedical research. Most of the effort goes to re-sequencing, but assembly of de novo-generated, raw sequence reads into contigs that span as much of the genome as possible is central to many projects. Although truly complete coverage is not realistically attainable, maximizing the amount of sequence that can be correctly assembled into contigs contributes to coverage. Here we compare three commonly used assembly algorithms (ABySS, Velvet and SOAPdenovo2), and show that empirical optimization of k-mer values has a disproportionate influence on de novo assembly of a eukaryotic genome, that of the nematode parasite Meloidogyne chitwoodi. Each assembler was challenged with ~40 million Illumina II paired-end reads, and assemblies were performed under a range of k-mer sizes. In each instance, the optimal k-mer was 127, although based on N50 values, ABySS was more efficient than the others. That the assembly was not spurious was established using the "Core Eukaryotic Gene Mapping Approach", which indicated that 98.79% of the M. chitwoodi genome was accounted for by the assembly. Subsequent gene finding and annotation are consistent with this and suggest that k-mer optimization contributes to the robustness of assembly.
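The abstract ranks assemblies by N50 across a range of k-mer sizes but does not spell out the procedure. The following is a minimal sketch of that selection step, assuming each assembly has already been run and reduced to a list of contig lengths. The function names `n50` and `best_k_by_n50` and the `toy_grid` values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def n50(contig_lengths: List[int]) -> int:
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembled bases."""
    if not contig_lengths:
        return 0
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the bases are covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def best_k_by_n50(assemblies: Dict[int, List[int]]) -> int:
    """Pick the k whose assembly has the highest N50 (ties go to the larger k)."""
    return max(assemblies, key=lambda k: (n50(assemblies[k]), k))


# Hypothetical contig lengths (bp) per k-mer size, for illustration only;
# these are NOT results from the study.
toy_grid = {
    31: [500, 400, 300, 200, 100],
    63: [900, 300, 200, 100],
    127: [1200, 200, 100],
}
chosen_k = best_k_by_n50(toy_grid)
```

N50 measures contiguity only, so a high value does not by itself show that an assembly is correct. The abstract addresses this separately with a CEGMA completeness check.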
Pages: 36-40
Page count: 5