A comparison across non-model animals suggests an optimal sequencing depth for de novo transcriptome assembly

Cited by: 66
Authors
Francis, Warren R. [1, 2]
Christianson, Lynne M. [1]
Kiko, Rainer [3]
Powers, Meghan L. [1, 2]
Shaner, Nathan C. [4]
Haddock, Steven H. D. [1]
Affiliations
[1] Monterey Bay Aquarium Res Inst, Moss Landing, CA 95039 USA
[2] Univ Calif Santa Cruz, Dept Ocean Sci, Santa Cruz, CA 95064 USA
[3] GEOMAR, Helmholtz Ctr Ocean Res Kiel, D-24105 Kiel, Germany
[4] Scintillon Inst, San Diego, CA 92121 USA
Source
BMC Genomics | 2013 / Volume 14
Keywords
RNA-SEQ DATA; DIFFERENTIAL EXPRESSION; GENES; NORMALIZATION;
DOI
10.1186/1471-2164-14-167
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: The lack of genomic resources can present challenges for studies of non-model organisms. Transcriptome sequencing offers an attractive method to gather information about genes and gene expression without the need for a reference genome. However, it is unclear what sequencing depth is adequate to assemble the transcriptome de novo for these purposes.
Results: We assembled transcriptomes of animals from six different phyla (Annelids, Arthropods, Chordates, Cnidarians, Ctenophores, and Molluscs) at regular increments of reads using Velvet/Oases and Trinity to determine how read count affects the assembly. This included an assembly of mouse heart reads, which we could compare against the available reference genome. We found qualitative differences between the assemblies of whole animals and those of tissues. With increasing reads, whole-animal assemblies show a rapid increase in transcripts and in the discovery of conserved genes, while single-tissue assemblies show a slower discovery of conserved genes, though the assembled transcripts were often longer. A deeper examination of the mouse assemblies shows that with more reads, assembly errors become more frequent, but such errors can be mitigated with more stringent assembly parameters.
Conclusions: These assembly trends suggest that representative assemblies are generated with as few as 20 million reads for tissue samples and 30 million reads for whole animals for RNA-level coverage. These depths provide a good balance between coverage and noise. Beyond 60 million reads, the discovery of new genes is low and sequencing errors of highly expressed genes are likely to accumulate. Finally, siphonophores (polymorphic Cnidarians) are an exception and possibly require alternate assembly strategies.
Pages: 1 / 12
Number of pages: 11