Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal

Cited by: 12
Authors
Etherington, Graham J. [1]
Heavens, Darren [1]
Baker, David [1]
Lister, Ashleigh [1]
McNelly, Rose [1]
Garcia, Gonzalo [1]
Clavijo, Bernardo [1]
Macaulay, Iain [1]
Haerty, Wilfried [1]
Di Palma, Federica [1]
Affiliation
[1] Norwich Res Pk, Earlham Inst, Norwich NR4 7UZ, Norfolk, England
Source
GIGASCIENCE | 2020 / Vol. 9 / Issue 05
Funding
UK Biotechnology and Biological Sciences Research Council;
Keywords
polecat; vertebrate; non-model organism; Illumina; chromium; Bionano; assembly; sequencing; POLECAT MUSTELA-PUTORIUS; CONSERVATION; GENOMICS; ANNOTATION; BIOLOGY;
DOI
10.1093/gigascience/giaa045
Chinese Library Classification number
Q [Biological sciences];
Subject classification code
07 ; 0710 ; 09 ;
Abstract
Background: Whilst much sequencing effort has focused on key mammalian model organisms such as mouse and human, little is known about the relationship between genome sequencing techniques for non-model mammals and genome assembly quality. This is especially relevant to non-model mammals, where the samples to be sequenced are often degraded and of low quality. A key aspect when planning a genome project is the choice of sequencing data to generate. This decision is driven by several factors, including the biological questions being asked, the quality of DNA available, and the availability of funds. Cutting-edge sequencing technologies now make it possible to achieve highly contiguous, chromosome-level genome assemblies, but rely on high-quality high molecular weight DNA. However, funding is often insufficient for many independent research groups to use these techniques. Here we use a range of different genomic technologies generated from a roadkill European polecat (Mustela putorius) to assess various assembly techniques on this low-quality sample. We evaluated different approaches for de novo assemblies and discuss their value in relation to biological analyses.
Results: Generally, assemblies containing more data types achieved better scores in our ranking system. However, when accounting for misassemblies, this was not always the case for Bionano and low-coverage 10x Genomics (for scaffolding only). We also find that the extra cost associated with combining multiple data types is not necessarily associated with better genome assemblies.
Conclusions: The high degree of variability between each de novo assembly method (assessed from the 7 key metrics) highlights the importance of carefully devising the sequencing strategy to be able to carry out the desired analysis. Adding more data to genome assemblies does not always result in better assemblies, so it is important to understand the nuances of genomic data integration explained here, in order to obtain cost-effective value for money when sequencing genomes.
Pages: 14