Assembly-free quantification of vagrant DNA inserts

Authors
Becher, Hannes [1]
Nichols, Richard A. [2]
Affiliations
[1] Univ Edinburgh, Inst Genet & Canc, Edinburgh, Scotland
[2] Queen Mary Univ London, Sch Biol & Behav Sci, London, England
Keywords
endosymbionts; genome skimming; nuclear pseudogenes; NUMTs; NUPTs; quantification; GRASSHOPPER PODISMA-PEDESTRIS; SEX-CHROMOSOME POLYMORPHISM; MITOCHONDRIAL-DNA; GENOME; NUMTS; PSEUDOGENES; RACES; CHLOROPLAST; SEQUENCES; ALIGNMENT
DOI
10.1111/1755-0998.13764
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Inserts of DNA from extranuclear sources, such as organelles and microbes, are common in eukaryote nuclear genomes. However, sequence similarity between the nuclear and extranuclear DNA, and a history of multiple insertions, make the assembly of these regions challenging. Consequently, the number, sequence and location of these vagrant DNAs cannot be reliably inferred from the genome assemblies of most organisms. We introduce two statistical methods to estimate the abundance of nuclear inserts even in the absence of a nuclear genome assembly. The first (intercept method) only requires low-coverage (<1x) sequencing data, as commonly generated for population studies of organellar and ribosomal DNAs. The second method additionally requires that a subset of the individuals carry extranuclear DNA with diverged genotypes. We validated our intercept method using simulations and by re-estimating the frequency of human NUMTs (nuclear mitochondrial inserts). We then applied it to the grasshopper Podisma pedestris, exceptional for both its large genome size and reports of numerous NUMT inserts, estimating that NUMTs make up 0.056% of the nuclear genome, equivalent to >500 times the mitochondrial genome size. We also re-analysed a museomics data set of the parrot Psephotellus varius, obtaining an estimate of only 0.0043%, in line with reports from other species of bird. Our study demonstrates the utility of low-coverage high-throughput sequencing data for the quantification of nuclear vagrant DNAs. Beyond quantifying organellar inserts, these methods could also be used on endosymbiont-derived sequences. We provide an R implementation of our methods called "vagrantDNA" and code to simulate test data sets.
Pages: 1002-1013
Page count: 12