A method for positive forensic identification of samples from extremely low-coverage sequence data

Cited by: 13
Authors
Vohr, Samuel H. [1]
Najar, Carlos Fernando Buen Abad [2]
Shapiro, Beth [3]
Green, Richard E. [1]
Affiliations
[1] Univ Calif Santa Cruz, Dept Biomol Engn, Santa Cruz, CA 95064 USA
[2] Univ Nacl Autonoma Mexico, Fac Ciencias, Mexico City 04510, DF, Mexico
[3] Univ Calif Santa Cruz, Dept Ecol & Evolutionary Biol, Santa Cruz, CA 95064 USA
Source
BMC GENOMICS | 2015 / Volume 16
Keywords
Forensics; Ancient DNA; Genomics; GENOME SEQUENCE; DNA; ANCIENT; ENRICHMENT; HAPLOTYPE;
DOI
10.1186/s12864-015-2241-6
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Determining whether two DNA samples originate from the same individual is difficult when the amount of retrievable DNA is limited. This is often the case for ancient, historic, and forensic samples. The most widely used approaches rely on amplification of a defined panel of multi-allelic markers and comparison to similar data from other samples. When the amount of retrievable DNA is low, these approaches fail.

Results: We describe a new method for assessing whether shotgun DNA sequence data from two samples are consistent with originating from the same or different individuals. Our approach makes use of the large catalogs of single nucleotide polymorphism (SNP) markers to maximize the chances of observing potentially discriminating alleles. We further reduce the amount of data required by taking advantage of patterns of linkage disequilibrium modeled by a reference panel of haplotypes to indirectly compare observations at pairs of linked SNPs. Using both coalescent simulations and real sequencing data from modern and ancient sources, we show that this approach is robust with respect to the reference panel and has power to detect positive identity from DNA libraries with less than 1% random and non-overlapping genome coverage in each sample.

Conclusion: We present a powerful new approach that can determine whether DNA from two samples originated from the same individual even when only minute quantities of DNA are recoverable from each.
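Illustrative note: the abstract frames identification as asking whether allele observations from two samples are more consistent with one individual or with two. The Python sketch below is a toy version of that idea and is not the authors' method. It assumes a biallelic SNP with known population allele frequency `p`, Hardy-Weinberg proportions, one error-free read per sample at a site covered by both samples, and unrelated individuals under the "different" hypothesis. It ignores linkage disequilibrium entirely, whereas the paper relies on it to compare linked SNPs that are not covered in both samples. The function names `site_log_lr` and `total_log_lr` are hypothetical.

```python
import math


def site_log_lr(p, allele_1, allele_2):
    """Log likelihood ratio (same vs. different individual) at one SNP.

    p        -- assumed population frequency of allele "A" (other allele is "a")
    allele_1 -- allele seen in the single read from sample 1 ("A" or "a")
    allele_2 -- allele seen in the single read from sample 2 ("A" or "a")
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    q = 1.0 - p
    if allele_1 == allele_2:
        f = p if allele_1 == "A" else q
        # Same individual: homozygote (f^2) always agrees; a heterozygote
        # (2pq) yields this particular allele twice with probability 1/4.
        same = f * f + p * q / 2.0
        # Different individuals: two independent draws from the population.
        diff = f * f
    else:
        # Same individual: must be heterozygous (2pq), and the two reads
        # must hit different chromosomes in this order (1/4).
        same = p * q / 2.0
        # Different individuals: independent draws of the two alleles.
        diff = p * q
    return math.log(same) - math.log(diff)


def total_log_lr(sites):
    """Sum per-site log likelihood ratios, treating sites as independent."""
    return sum(site_log_lr(p, a1, a2) for p, a1, a2 in sites)
```

Under these toy assumptions a positive total favors a single source and a negative total favors two unrelated sources. Each site contributes little evidence on its own (a discordant pair of reads only halves the likelihood ratio), which is consistent with the abstract's emphasis on using as many SNPs as possible.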
Pages: 11