Ancestry inference using reference labeled clusters of haplotypes

Cited by: 4
Authors
Wang, Yong [1]
Song, Shiya [1]
Schraiber, Joshua G. [1]
Sedghifar, Alisa [1]
Byrnes, Jake K. [1]
Turissini, David A. [1]
Hong, Eurie L. [1]
Ball, Catherine A. [1]
Noto, Keith [1]
Affiliations
[1] AncestryDNA, San Francisco, CA 94107 USA
Keywords
ARCHes; Ancestry inference; Haplotype modeling; Local ancestry; HMM; RFMix
DOI
10.1186/s12859-021-04350-x
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: We present ARCHes, a fast and accurate haplotype-based approach for inferring an individual's ancestry composition. Our approach works by modeling haplotype diversity from a large, admixed cohort of hundreds of thousands of individuals, then annotating those models with population information from reference panels of known ancestry.
Results: The running time of ARCHes does not depend on the size of the reference panel, because training and testing are separate processes and the inferred population-annotated haplotype models can be written to disk and reused to label large test sets in parallel. In our experiments, assigning ancestry from 32 populations takes less than one minute on average using 10 CPUs. We test ARCHes on public data from the 1000 Genomes Project and the Human Genome Diversity Project (HGDP), as well as on simulated examples of known admixture.
Conclusions: Our results demonstrate that ARCHes outperforms RFMix at correctly assigning both global and local ancestry at finer population scales, regardless of the amount of population admixture.
Pages: 14