Fast model-based estimation of ancestry in unrelated individuals

Cited by: 5581
Authors
Alexander, David H. [1]
Novembre, John [2]
Lange, Kenneth [3, 4]
Affiliations
[1] Univ Calif Los Angeles, Dept Biomath, Los Angeles, CA 90095 USA
[2] Univ Calif Los Angeles, Dept Ecol & Evolutionary Biol, Los Angeles, CA 90095 USA
[3] Univ Calif Los Angeles, Dept Human Genet, Los Angeles, CA 90095 USA
[4] Univ Calif Los Angeles, Dept Stat, Los Angeles, CA 90095 USA
Keywords
MULTILOCUS GENOTYPE DATA; EM ALGORITHM; POPULATION-STRUCTURE; ADMIXED POPULATIONS; GENETIC ASSOCIATION; INFERENCE; ADMIXTURE; GENOME; ACCELERATION; HAPLOTYPE
DOI
10.1101/gr.094052.109
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Population stratification has long been recognized as a confounding factor in genetic association studies. Estimated ancestries, derived from multi-locus genotype data, can be used to perform a statistical correction for population stratification. One popular technique for estimation of ancestry is the model-based approach embodied by the widely applied program structure. Another approach, implemented in the program EIGENSTRAT, relies on Principal Component Analysis rather than model-based estimation and does not directly deliver admixture fractions. EIGENSTRAT has gained in popularity in part owing to its remarkable speed in comparison to structure. We present a new algorithm and a program, ADMIXTURE, for model-based estimation of ancestry in unrelated individuals. ADMIXTURE adopts the likelihood model embedded in structure. However, ADMIXTURE runs considerably faster, solving problems in minutes that take structure hours. In many of our experiments, we have found that ADMIXTURE is almost as fast as EIGENSTRAT. The runtime improvements of ADMIXTURE rely on a fast block relaxation scheme using sequential quadratic programming for block updates, coupled with a novel quasi-Newton acceleration of convergence. Our algorithm also runs faster and with greater accuracy than the implementation of an Expectation-Maximization (EM) algorithm incorporated in the program FRAPPE. Our simulations show that ADMIXTURE's maximum likelihood estimates of the underlying admixture coefficients and ancestral allele frequencies are as accurate as structure's Bayesian estimates. On real-world data sets, ADMIXTURE's estimates are directly comparable to those from structure and EIGENSTRAT. Taken together, our results show that ADMIXTURE's computational speed opens up the possibility of using a much larger set of markers in model-based ancestry estimation and that its estimates are suitable for use in correcting for population stratification in association studies.
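To make the likelihood model mentioned in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual binomial admixture model, in which individual i's genotype at SNP j counts reference alleles drawn with probability sum_k q_ik f_kj, and it fits the model with a plain EM update of the kind the abstract attributes to FRAPPE. It does not implement ADMIXTURE's block relaxation, sequential quadratic programming, or quasi-Newton acceleration. The function names (`log_likelihood`, `em_step`, `fit`), the `eps` clamp, and the random initialization are illustrative assumptions.

```python
import math
import random


def log_likelihood(G, Q, F):
    """Admixture log-likelihood with constant terms dropped.

    G[i][j]: copies (0, 1 or 2) of the reference allele, individual i, SNP j.
    Q[i][k]: fraction of individual i's ancestry from population k.
    F[k][j]: reference-allele frequency at SNP j in population k.
    """
    total = 0.0
    for i, row in enumerate(G):
        for j, g in enumerate(row):
            p = sum(Q[i][k] * F[k][j] for k in range(len(F)))
            total += g * math.log(p) + (2 - g) * math.log(1.0 - p)
    return total


def em_step(G, Q, F, eps=1e-6):
    """One EM update of Q and F (baseline method, not ADMIXTURE's scheme)."""
    n, m, K = len(G), len(G[0]), len(F)
    new_Q = [[0.0] * K for _ in range(n)]
    num = [[0.0] * m for _ in range(K)]
    den = [[0.0] * m for _ in range(K)]
    for i in range(n):
        for j in range(m):
            g = G[i][j]
            p = sum(Q[i][k] * F[k][j] for k in range(K))
            for k in range(K):
                # Expected reference / alternate allele copies from population k.
                a = g * Q[i][k] * F[k][j] / p
                b = (2 - g) * Q[i][k] * (1.0 - F[k][j]) / (1.0 - p)
                new_Q[i][k] += (a + b) / (2 * m)
                num[k][j] += a
                den[k][j] += a + b
    # Clamp frequencies away from 0 and 1 so the logarithms stay finite.
    new_F = [[min(max(num[k][j] / den[k][j], eps), 1.0 - eps)
              for j in range(m)] for k in range(K)]
    return new_Q, new_F


def fit(G, K, iters=50, seed=0):
    """Run EM from a random start; return Q, F and the log-likelihood trace."""
    rng = random.Random(seed)
    n, m = len(G), len(G[0])
    Q = []
    for _ in range(n):
        w = [rng.uniform(0.5, 1.5) for _ in range(K)]
        s = sum(w)
        Q.append([x / s for x in w])
    F = [[rng.uniform(0.1, 0.9) for _ in range(m)] for _ in range(K)]
    history = [log_likelihood(G, Q, F)]
    for _ in range(iters):
        Q, F = em_step(G, Q, F)
        history.append(log_likelihood(G, Q, F))
    return Q, F, history
```

Calling `fit(G, K=2)` on a small genotype matrix returns ancestry fractions whose rows sum to one. Plain EM of this kind typically needs many iterations to converge, which is the cost the abstract says ADMIXTURE's accelerated scheme is designed to reduce.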
Pages: 1655-1664
Number of pages: 10