Comparative Evaluation of Genome Assemblers from Long-Read Sequencing for Plants and Crops

Cited by: 17
Authors
Jung, Hyungtaek [1]
Jeon, Min-Seung [2]
Hodgett, Matthew [3]
Waterhouse, Peter [1]
Eyun, Seong-il [2]
Affiliations
[1] Queensland Univ Technol, Ctr Agr & Biocommod, Brisbane, Qld 4001, Australia
[2] Chung Ang Univ, Dept Life Sci, Seoul 06974, South Korea
[3] Queensland Univ Technol, Informat Technol Serv, Brisbane, Qld 4001, Australia
Funding
Australian Research Council
Keywords
plant genome; next-generation sequencing; Pacific Biosciences; long reads; nanopore; assemblers;
DOI
10.1021/acs.jafc.0c01647
Chinese Library Classification number
S [Agricultural Sciences]
Discipline classification code
09
Abstract
The availability of recent state-of-the-art long-read sequencing technologies has significantly increased the ease and speed of producing high-quality plant genome assemblies. A wide variety of genome-related software tools are now available and they are typically benchmarked using microbial or model eukaryotic genomes such as Arabidopsis and rice. However, many plant species have much larger and more complex genomes than these, and the choice of tools, parameters, and/or strategies that can be used is not always obvious. Thus, we have compared the metrics of assemblies generated by various pipelines to discuss how assembly quality can be affected by two different assembly strategies. First, we focused on optimizing read preprocessing and assembler variables using eight different de novo assemblers on five different Pacific Biosciences long-read datasets of diploid and tetraploid species. Then, we examined a single scaffolding tool (quickmerge) that has been employed for the postprocessing step. We then merged the outputs from multiple assemblies to produce a higher quality consensus assembly. Then, we benchmarked the assemblies for completeness and accuracy (assembly metrics and BUSCO), computer memory, and CPU times. Two lightweight assemblers, Miniasm/Minimap/Racon and WTDBG, were deemed good for novice users because they involved smaller required learning curves and light computational resources. However, two heavyweight tools, CANU and Flye, should be the first choice when the goal is to achieve accurate and complete assemblies. Our results will provide valuable guidance in future plant genome projects and beyond.
Pages: 7670-7677
Number of pages: 8