Ancestry inference using reference labeled clusters of haplotypes

Cited by: 4
Authors
Wang, Yong [1]
Song, Shiya [1]
Schraiber, Joshua G. [1]
Sedghifar, Alisa [1]
Byrnes, Jake K. [1]
Turissini, David A. [1]
Hong, Eurie L. [1]
Ball, Catherine A. [1]
Noto, Keith [1]
Affiliation
[1] AncestryDNA, San Francisco, CA 94107 USA
Keywords
ARCHes; Ancestry inference; Haplotype modeling; Local ancestry; HMM; RFMix
DOI
10.1186/s12859-021-04350-x
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: We present ARCHes, a fast and accurate haplotype-based approach for inferring an individual's ancestry composition. Our approach works by modeling haplotype diversity from a large, admixed cohort of hundreds of thousands, then annotating those models with population information from reference panels of known ancestry.
Results: The running time of ARCHes does not depend on the size of a reference panel because training and testing are separate processes, and the inferred population-annotated haplotype models can be written to disk and reused to label large test sets in parallel (in our experiments, it averages less than one minute to assign ancestry from 32 populations using 10 CPU). We test ARCHes on public data from the 1000 Genomes Project and the Human Genome Diversity Project (HGDP) as well as simulated examples of known admixture.
Conclusions: Our results demonstrate that ARCHes outperforms RFMix at correctly assigning both global and local ancestry at finer population scales regardless of the amount of population admixture.
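The two-stage idea in the abstract (annotate haplotype clusters with reference populations once, then reuse the annotation to label test samples) can be illustrated with a toy Python sketch. This is not ARCHes itself: the abstract gives no implementation details, and the keywords suggest an HMM, whereas this sketch swaps in plain frequency counting and a per-window majority call. All function names, the integer cluster IDs, and the populations "A" and "B" are hypothetical.

```python
from collections import Counter, defaultdict


def annotate_clusters(reference_windows, reference_labels):
    """Assumed toy 'training' step: for each (window, cluster) pair, record
    the fraction of reference haplotypes from each population carrying it."""
    counts = defaultdict(Counter)
    for clusters, pop in zip(reference_windows, reference_labels):
        for window, cluster in enumerate(clusters):
            counts[(window, cluster)][pop] += 1
    annotation = {}
    for key, pop_counts in counts.items():
        total = sum(pop_counts.values())
        annotation[key] = {pop: n / total for pop, n in pop_counts.items()}
    return annotation


def label_haplotype(clusters, annotation):
    """Toy local ancestry: most frequent population per window,
    or None when the cluster was never seen in the reference panel."""
    local = []
    for window, cluster in enumerate(clusters):
        probs = annotation.get((window, cluster))
        local.append(max(probs, key=probs.get) if probs else None)
    return local


def global_ancestry(local):
    """Toy global ancestry: share of called windows per population."""
    called = [pop for pop in local if pop is not None]
    if not called:
        return {}
    return {pop: n / len(called) for pop, n in Counter(called).items()}


# Hypothetical reference panel: each haplotype is a list of cluster IDs,
# one per genomic window, with a known population label.
reference_windows = [[0, 1, 0], [0, 1, 1], [2, 3, 2], [2, 3, 2]]
reference_labels = ["A", "A", "B", "B"]

annotation = annotate_clusters(reference_windows, reference_labels)
local = label_haplotype([0, 3, 2], annotation)
proportions = global_ancestry(local)
```

Labeling needs only the `annotation` dictionary, not the reference panel, which mirrors the abstract's point that labeling time does not depend on reference panel size.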
Pages: 14
Related papers
50 in total
  • [1] Ancestry inference using reference labeled clusters of haplotypes
    Yong Wang
    Shiya Song
    Joshua G. Schraiber
    Alisa Sedghifar
    Jake K. Byrnes
    David A. Turissini
    Eurie L. Hong
    Catherine A. Ball
    Keith Noto
    BMC Bioinformatics, 22
  • [2] Improved ancestry inference using weights from external reference panels
    Chen, Chia-Yen
    Pollack, Samuela
    Hunter, David J.
    Hirschhorn, Joel N.
    Kraft, Peter
    Price, Alkes L.
    BIOINFORMATICS, 2013, 29 (11) : 1399 - 1406
  • [3] Mini-haplotypes as lineage informative SNPs and ancestry inference SNPs
    Pakstis, Andrew J.
    Fang, Rixun
    Furtado, Manohar R.
    Kidd, Judith R.
    Kidd, Kenneth K.
    EUROPEAN JOURNAL OF HUMAN GENETICS, 2012, 20 (11) : 1148 - 1154
  • [4] Mini-haplotypes as lineage informative SNPs and ancestry inference SNPs
    Andrew J Pakstis
    Rixun Fang
    Manohar R Furtado
    Judith R Kidd
    Kenneth K Kidd
    European Journal of Human Genetics, 2012, 20 : 1148 - 1154
  • [5] Ancestry inference using machine learning
    Tang, Lin
    NATURE METHODS, 2023, 20 (09) : 1274 - 1274
  • [6] Ancestry inference using machine learning
    Lin Tang
    Nature Methods, 2023, 20 : 1274 - 1274
  • [7] A statistical model for reference-free inference of archaic local ancestry
    Durvasula, Arun
    Sankararaman, Sriram
    PLOS GENETICS, 2019, 15 (05):
  • [8] Assessing the limits of local ancestry inference from small reference panels
    Oliveira, Sandra
    Marchi, Nina
    Excoffier, Laurent
    MOLECULAR ECOLOGY RESOURCES, 2024, 24 (06)
  • [9] Inference of ancestry: Constructing hierarchical reference populations and assigning unknown individuals
    Ekins J.E.
    Ekins J.B.
    Layton L.
    Hutchison L.A.D.
    Myres N.M.
    Woodward S.R.
    Human Genomics, 2 (4) : 212 - 235
  • [10] Inference of admixed ancestry with Ancestry Informative Markers
    Tvedebrink, Torben
    Eriksen, Poul Svante
    FORENSIC SCIENCE INTERNATIONAL-GENETICS, 2019, 42 : 147 - 153