Principal Component Regression and Linear Mixed Model in Association Analysis of Structured Samples: Competitors or Complements?

Cited by: 31
Authors
Zhang, Yiwei [1]
Pan, Wei [1]
Affiliations
[1] Univ Minnesota, Sch Publ Hlth, Div Biostat, Minneapolis, MN 55455 USA
Funding
European Research Council;
Keywords
association testing; confounding; environmental risk; population stratification; probabilistic principal component analysis; POPULATION STRATIFICATION; VARIANTS; SCALE;
DOI
10.1002/gepi.21879
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Genome-wide association studies (GWAS) have been established as a major tool to identify genetic variants associated with complex traits, such as common diseases. However, GWAS may suffer from false positives and false negatives due to confounding population structures, including known or unknown relatedness. Another important issue is unmeasured environmental risk factors. Among many methods for adjusting for population structures, two approaches stand out: one is principal component regression (PCR) based on principal component analysis, which is perhaps the most popular due to its early appearance, simplicity, and general effectiveness; the other is based on a linear mixed model (LMM) that has emerged recently as perhaps the most flexible and effective, especially for samples with complex structures as in model organisms. As shown previously, the PCR approach can be regarded as an approximation to an LMM; such an approximation depends on the number of the top principal components (PCs) used, the choice of which is often difficult in practice. Hence, in the presence of population structure, the LMM appears to outperform the PCR method. However, due to the different treatments of fixed vs. random effects in the two approaches, we show an advantage of PCR over LMM: in the presence of an unknown but spatially confined environmental confounder (e.g., environmental pollution or lifestyle), the PCs may be able to implicitly and effectively adjust for the confounder whereas the LMM cannot. Accordingly, to adjust for both population structures and nongenetic confounders, we propose a hybrid method combining the use and, thus, strengths of PCR and LMM. We use real genotype data and simulated phenotypes to confirm the above points, and establish the superior performance of the hybrid method across all scenarios.
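The PCR adjustment described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the simulated genotypes, the two-population design, the effect size, and the helper names `top_pcs`, `snp_t_stat`, and `simulate_demo` are all assumptions made for illustration. The sketch tests a single SNP with and without the top principal components as fixed covariates when a confounder is confined to one population; the LMM and the hybrid method are not implemented here.

```python
import numpy as np


def top_pcs(G, k):
    """Scores of the top-k principal components of G (n samples x m SNPs)."""
    Gc = G - G.mean(axis=0)  # center each SNP column
    U, S, _ = np.linalg.svd(Gc, full_matrices=False)
    return U[:, :k] * S[:k]


def snp_t_stat(y, g, covariates=None):
    """OLS t-statistic for SNP g in the model y ~ 1 + g + covariates."""
    n = len(y)
    cols = [np.ones(n), g]
    if covariates is not None:
        cols.extend(covariates.T)  # one column per covariate (2-D input expected)
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])


def simulate_demo(seed=0, n_per_pop=200, m=500, k=2):
    """Hypothetical two-population example with a population-confined confounder.

    No SNP has a true effect; the phenotype shift comes only from an
    unmeasured exposure present in population 1 (an assumed setup).
    """
    rng = np.random.default_rng(seed)
    pop = np.repeat([0, 1], n_per_pop)
    p0 = rng.uniform(0.1, 0.9, size=m)                      # allele freqs, pop 0
    p1 = np.clip(p0 + rng.normal(0, 0.15, size=m), 0.05, 0.95)  # pop 1
    freqs = np.where(pop[:, None] == 0, p0, p1)
    G = rng.binomial(2, freqs).astype(float)
    y = 2.0 * pop + rng.normal(size=pop.size)               # confounder only
    j = int(np.argmax(np.abs(p1 - p0)))                     # most differentiated SNP
    t_naive = snp_t_stat(y, G[:, j])
    t_adj = snp_t_stat(y, G[:, j], covariates=top_pcs(G, k))
    return t_naive, t_adj


if __name__ == "__main__":
    t_naive, t_adj = simulate_demo()
    print(f"unadjusted t = {t_naive:.2f}, PC-adjusted t = {t_adj:.2f}")
```

In this assumed setup the unadjusted statistic is inflated because the SNP's allele frequency tracks the population carrying the exposure, while including the top PCs as fixed effects absorbs most of that shift. A fuller comparison in the spirit of the abstract would also fit an LMM with a kinship-based random effect, with and without the PCs as covariates.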
Pages: 149-155
Number of pages: 7
Related papers
50 items in total
  • [31] Combating outliers and multicollinearity in linear regression model using robust Kibria-Lukman mixed with principal component estimator, simulation and computation
    Arum, K. C.
    Ugwuowo, F. I.
    Oranye, H. E.
    Alakija, T. O.
    Ugah, T. E.
    Asogwa, O. C.
    SCIENTIFIC AFRICAN, 2023, 19
  • [32] Spectral simulation study on the influence of the principal component analysis step on principal component regression
    Hasegawa, T
    APPLIED SPECTROSCOPY, 2006, 60 (01) : 95 - 98
  • [33] Electric Field Strength Calculation of Regression Model Based on Principal Component Analysis
    Li, Zhuoshi
    Wang, Zenghui
    Zhang, Tingting
    Ding, Xiaoqi
    PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND MANAGEMENT INNOVATION, 2015, 28 : 552 - 556
  • [34] Biresponse nonparametric regression model in principal component analysis with truncated spline estimator
    Islamiyati, Anna
    Kalondeng, Anisa
    Sunusi, Nurtiti
    Zakir, Muhammad
    Amir, Amir Kamal
    JOURNAL OF KING SAUD UNIVERSITY SCIENCE, 2022, 34 (03)
  • [35] Linear Subspace Principal Component Regression Model for Quality Estimation of Nonlinear and Multimode Industrial Processes
    Zhen, Junhua
    Song, Zhihuan
    INDUSTRIAL & ENGINEERING CHEMISTRY RESEARCH, 2017, 56 (21) : 6275 - 6285
  • [36] Evaluation of principal component selection methods to form a global prediction model by principal component regression
    Xie, YL
    Kalivas, JH
    ANALYTICA CHIMICA ACTA, 1997, 348 (1-3) : 19 - 27
  • [37] PRINCIPAL COMPONENT REGRESSION FOR TOBIT MODEL AND PURCHASES OF GOLD
    Alhusseini, Fadel Hamid Hadi
    Odah, Meshal Harbi
    PROCEEDINGS OF THE 10TH INTERNATIONAL MANAGEMENT CONFERENCE: CHALLENGES OF MODERN MANAGEMENT (IMC 2016), 2016, : 491 - 500
  • [38] Examination of criteria for local model principal component regression
    Bakken, GA
    Long, DR
    Kalivas, JH
    APPLIED SPECTROSCOPY, 1997, 51 (12) : 1814 - 1822
  • [39] Analysis of PEM fuel cell experimental data using principal component analysis and multi linear regression
    Placca, Latevi
    Kouta, Raed
    Candusso, Denis
    Blachot, Jean-Francois
    Charon, Willy
    INTERNATIONAL JOURNAL OF HYDROGEN ENERGY, 2010, 35 (10) : 4582 - 4591
  • [40] Fast Algorithms for Structured Robust Principal Component Analysis
    Ayazoglu, Mustafa
    Sznaier, Mario
    Camps, Octavia I.
    2012 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2012, : 1704 - 1711