Federated Principal Component Analysis for Genome-Wide Association Studies

Cited by: 8
Authors
Hartebrodt, Anne [1]
Nasirigerdeh, Reza [2]
Blumenthal, David B. [3]
Röttger, Richard [1]
Affiliations
[1] Univ Southern Denmark, Odense, Denmark
[2] Tech Univ Munich, Munich, Germany
[3] Friedrich Alexander Univ Erlangen Nürnberg, Erlangen, Germany
Keywords
Algorithms
DOI
10.1109/ICDM51629.2021.00127
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Federated learning (FL) has emerged as a privacy-aware alternative to centralized data analysis, especially for biomedical analyses such as genome-wide association studies (GWAS). The data remains with the owner, which enables studies previously impossible due to privacy protection regulations. Principal component analysis (PCA) is a frequent preprocessing step in GWAS, where the eigenvectors of the sample-by-sample covariance matrix are used as covariates in the statistical tests. Therefore, a federated version of PCA suitable for vertical data partitioning is required for federated GWAS. Existing federated PCA algorithms exchange the complete sample eigenvectors, a potential privacy breach. In this paper, we present a federated PCA algorithm for vertically partitioned data which does not exchange the sample eigenvectors and is hence suitable for federated GWAS. We show that it outperforms existing federated solutions in terms of convergence behavior and scalability. Additionally, we provide a user-friendly privacy-aware web tool to promote acceptance of federated PCA among GWAS researchers.
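To make the data flow described in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a features-by-samples matrix whose columns (samples) are split across sites and uses plain power iteration to recover only the leading sample eigenvector. The function name `federated_top_sample_eigenvector` and the variable names are hypothetical, and centering and scaling of genotype data are omitted. Each site keeps its own slice of the sample eigenvector; only a feature-space vector and scalar squared norms are aggregated.

```python
import numpy as np

def federated_top_sample_eigenvector(site_blocks, n_iter=200, seed=0):
    """Illustrative power iteration over column-partitioned data.

    site_blocks: list of arrays, each of shape (n_features, n_local_samples).
    Returns the per-site slices of the leading sample eigenvector and the
    shared feature-space vector. All names here are hypothetical.
    """
    rng = np.random.default_rng(seed)
    # Each site initialises its own slice of the sample eigenvector locally.
    g_parts = [rng.standard_normal(A.shape[1]) for A in site_blocks]
    for _ in range(n_iter):
        # Sites send the feature-space products A_k g_k; the aggregator sums them.
        h = sum(A @ g for A, g in zip(site_blocks, g_parts))
        h = h / np.linalg.norm(h)
        # Sites update their local slices from the shared feature-space vector.
        g_parts = [A.T @ h for A in site_blocks]
        # Only scalar squared norms are shared to normalise the global vector.
        norm = np.sqrt(sum(float(g @ g) for g in g_parts))
        g_parts = [g / norm for g in g_parts]
    return g_parts, h

# Simulated example: 30 features, 12 samples split over three sites.
rng = np.random.default_rng(1)
data = rng.standard_normal((30, 12))
# Add a strong rank-1 component so the leading eigenvector is well separated.
data += 5.0 * np.outer(rng.standard_normal(30), rng.standard_normal(12))
blocks = [data[:, :4], data[:, 4:9], data[:, 9:]]

g_parts, h = federated_top_sample_eigenvector(blocks)
g_global = np.concatenate(g_parts)  # assembled here only to check the result
reference = np.linalg.svd(data)[2][0]  # centralized leading right singular vector
agreement = abs(float(g_global @ reference))
```

In this simulation, `agreement` is the absolute cosine between the federated result and the centralized singular vector. The slices are concatenated only for that check; in a federated setting they would stay at their sites. Extending the sketch to several eigenvectors would also need an orthonormalization step across sites, which is not shown here.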
Pages: 1090-1095
Number of pages: 6