Population Substructure and Control Selection in Genome-Wide Association Studies

Cited by: 96
Authors
Yu, Kai [1]
Wang, Zhaoming [1, 2]
Li, Qizhai [1, 3]
Wacholder, Sholom [1]
Hunter, David J. [1, 4]
Hoover, Robert N. [1]
Chanock, Stephen [1]
Thomas, Gilles [1]
Affiliations
[1] NCI, Div Canc Epidemiol & Genet, Rockville, MD USA
[2] Natl Canc Inst Frederick, SAIC Frederick Inc, Advanced Technol Program, Core Genotyping Facility, Frederick, MD USA
[3] Chinese Acad Sci, Acad Math & Syst Sci, Beijing, Peoples R China
[4] Harvard Sch Publ Hlth, Dept Epidemiol, Program Mol & Genet Epidemiol, Boston, MA USA
Source
PLOS ONE | 2008 / Volume 3 / Issue 07
DOI
10.1371/journal.pone.0002551
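Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09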
Abstract
Determination of the relevance of both demanding classical epidemiologic criteria for control selection and robust handling of population stratification (PS) represents a major challenge in the design and analysis of genome-wide association studies (GWAS). Empirical data from two GWAS in European Americans of the Cancer Genetic Markers of Susceptibility (CGEMS) project were used to evaluate the impact of PS in studies with different control selection strategies. In each of the two original case-control studies nested in corresponding prospective cohorts, a minor confounding effect due to PS (inflation factor λ of 1.025 and 1.005) was observed. In contrast, when the control groups were exchanged to mimic a cost-effective but theoretically less desirable control selection strategy, the confounding effects were larger (λ of 1.090 and 1.062). A panel of 12,898 autosomal SNPs common to both the Illumina and Affymetrix commercial platforms and with low local background linkage disequilibrium (pair-wise r² < 0.004) was selected to infer population substructure with principal component analysis. A novel permutation procedure was developed for the correction of PS that identified a smaller set of principal components and achieved a better control of type I error (to λ of 1.032 and 1.006, respectively) than currently used methods. The overlap between sets of SNPs in the bottom 5% of p-values based on the new test and the test without PS correction was about 80%, with the majority of discordant SNPs having both ranks close to the threshold. Thus, for the CGEMS GWAS of prostate and breast cancer conducted in European Americans, PS does not appear to be a major problem in well-designed studies. A study using suboptimal controls can have acceptable type I error when an effective strategy for the correction of PS is employed.
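The inflation factor λ quoted in the abstract can be illustrated with a minimal sketch. It assumes the common genomic-control definition (median of the observed 1-degree-of-freedom association statistics divided by the median of the null chi-square distribution); the abstract does not state which estimator the authors used. The function name `inflation_factor` and the simulated statistics are hypothetical and are not data from the study.

```python
import random
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Genomic-control lambda: observed median statistic over the null median.

    Assumes each entry is a 1-df chi-square association statistic for one SNP.
    """
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


# Simulated null statistics (squared standard normal draws), for illustration only.
rng = random.Random(0)
null_stats = [rng.gauss(0.0, 1.0) ** 2 for _ in range(100_000)]
lam = inflation_factor(null_stats)

# Uniformly inflating every statistic by 9% scales lambda by the same factor,
# which is how a value such as 1.090 would be read under this definition.
inflated_lam = inflation_factor([1.09 * s for s in null_stats])
```

Under this definition, a value near 1 indicates little inflation of the test statistics, while values above 1 suggest confounding such as population stratification.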
Pages: 14