Comparative Evaluation of Genome Assemblers from Long-Read Sequencing for Plants and Crops

Cited by: 17
Authors
Jung, Hyungtaek [1]
Jeon, Min-Seung [2]
Hodgett, Matthew [3]
Waterhouse, Peter [1]
Eyun, Seong-il [2]
Affiliations
[1] Queensland Univ Technol, Ctr Agr & Biocommod, Brisbane, Qld 4001, Australia
[2] Chung Ang Univ, Dept Life Sci, Seoul 06974, South Korea
[3] Queensland Univ Technol, Informat Technol Serv, Brisbane, Qld 4001, Australia
Funding
Australian Research Council
Keywords
plant genome; next-generation sequencing; Pacific Biosciences; long reads; nanopore; assemblers;
DOI
10.1021/acs.jafc.0c01647
Chinese Library Classification (CLC) number
S [Agricultural Sciences]
Discipline classification code
09
Abstract
Recent state-of-the-art long-read sequencing technologies have made high-quality plant genome assemblies significantly easier and faster to produce. A wide variety of genome-related software tools are now available, and they are typically benchmarked on microbial or model eukaryotic genomes such as Arabidopsis and rice. However, many plant species have much larger and more complex genomes, and the best choice of tools, parameters, and strategies is not always obvious. We therefore compared the metrics of assemblies generated by various pipelines to show how assembly quality is affected by two different assembly strategies. First, we optimized read preprocessing and assembler variables using eight de novo assemblers on five Pacific Biosciences long-read datasets from diploid and tetraploid species. Second, we examined a single scaffolding tool (quickmerge) for the postprocessing step, merging the outputs of multiple assemblies to produce a higher-quality consensus assembly. We then benchmarked the assemblies for completeness and accuracy (assembly metrics and BUSCO), memory usage, and CPU time. Two lightweight assemblers, Miniasm/Minimap/Racon and WTDBG, suit novice users because they have gentler learning curves and modest computational requirements. However, two heavyweight tools, CANU and Flye, should be the first choice when the goal is an accurate and complete assembly. Our results will provide valuable guidance for future plant genome projects and beyond.
Pages: 7670-7677
Number of pages: 8