Evaluation of polygenic prediction methodology within a reference-standardized framework

Cited by: 88
Authors
Pain, Oliver [1, 2]
Glanville, Kylie P. [1]
Hagenaars, Saskia P. [1]
Selzam, Saskia [1]
Furtjes, Anna E. [1]
Gaspar, Helena A. [1]
Coleman, Jonathan R. I. [1]
Rimfeld, Kaili [1]
Breen, Gerome [1, 2]
Plomin, Robert [1]
Folkersen, Lasse [3]
Lewis, Cathryn M. [1, 2, 4]
Affiliations
[1] Kings Coll London, Social Genet & Dev Psychiat Ctr, Inst Psychiat, London, England
[2] South London & Maudsley NHS Trust, NIHR Maudsley Biomed Res Ctr, London, England
[3] Sankt Hans Hosp, Inst Biol Psychiat, Copenhagen, Denmark
[4] Kings Coll London, Fac Life Sci & Med, Dept Med & Mol Genet, London, England
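Source
PLOS GENETICS | 2021 / Vol. 17 / Issue 05
Funding
UK Medical Research Council
Keywords
SCORES; HERITABILITY
DOI
10.1371/journal.pgen.1009021
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract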
The predictive utility of polygenic scores is increasing, and many polygenic scoring methods are available, but it is unclear which method performs best. This study evaluates the predictive utility of polygenic scoring methods within a reference-standardized framework, which uses a common set of variants and reference-based estimates of linkage disequilibrium and allele frequencies to construct scores. Eight polygenic score methods were tested: p-value thresholding and clumping (pT+clump), SBLUP, lassosum, LDpred1, LDpred2, PRScs, DBSLMM and SBayesR, evaluating their performance to predict outcomes in UK Biobank and the Twins Early Development Study (TEDS). Strategies to identify optimal p-value threshold and shrinkage parameters were compared, including 10-fold cross-validation, pseudovalidation and infinitesimal models (with no validation sample), and multi-polygenic score elastic net models. LDpred2, lassosum and PRScs performed strongly using 10-fold cross-validation to identify the most predictive p-value threshold or shrinkage parameter, giving a relative improvement of 16-18% over pT+clump in the correlation between observed and predicted outcome values. Using pseudovalidation, the best methods were PRScs, DBSLMM and SBayesR. PRScs pseudovalidation was only 3% worse than the best polygenic score identified by 10-fold cross-validation. Elastic net models containing polygenic scores based on a range of parameters consistently improved prediction over any single polygenic score. Within a reference-standardized framework, the best polygenic prediction was achieved using LDpred2, lassosum and PRScs, modeling multiple polygenic scores derived using multiple parameters. This study will help researchers performing polygenic score studies to select the most powerful and predictive analysis methods.

Author summary
An individual's genetic predisposition to a given outcome can be summarized using polygenic scores. Polygenic scores are widely used in research and could also be used in a clinical setting to enhance personalized medicine. A range of methods have been developed for calculating polygenic scores, but it is unclear which methods are the best. Several methods provide multiple polygenic scores for each individual, which must then be tested in an independent tuning sample to identify which polygenic score is most accurate. Other methods provide a single polygenic score and therefore do not require a tuning sample. Our study compares the prediction accuracy of eight leading polygenic scoring methods in a range of contexts. For methods that calculate multiple polygenic scores, we find that LDpred2, lassosum, and PRScs methods perform best on average. For methods that provide a single polygenic score, not requiring a tuning sample, we find PRScs performs best, and the faster DBSLMM and SBayesR methods also perform well. Our study has provided a comprehensive comparison of polygenic scoring methods that will guide future implementation of polygenic scores in both research and clinical settings.
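
As an illustration of the tuning strategy described in the abstract, the following is a minimal sketch of how 10-fold cross-validation might be used to pick the most predictive score from a set of candidates, and how a relative improvement in correlation can be computed. It is not the authors' pipeline: the function names, the threshold labels and the simulated data are all hypothetical, and only the Python standard library is assumed.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cv_select(scores, outcome, k=10, seed=0):
    """Pick the best candidate score within each training split.

    scores: dict mapping a (hypothetical) parameter label to a list of
    per-individual polygenic scores; outcome: list of observed values.
    Returns the label chosen in each fold and the mean held-out correlation.
    """
    idx = list(range(len(outcome)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    picks, held_out = [], []
    for fold in folds:
        test = set(fold)
        train = [i for i in idx if i not in test]
        y_train = [outcome[i] for i in train]
        # Choose the candidate with the highest training correlation.
        best = max(
            scores,
            key=lambda name: pearson([scores[name][i] for i in train], y_train),
        )
        picks.append(best)
        # Evaluate the chosen candidate on individuals it was not tuned on.
        held_out.append(
            pearson([scores[best][i] for i in fold], [outcome[i] for i in fold])
        )
    return picks, statistics.fmean(held_out)


def relative_improvement(r_method, r_baseline):
    """Relative gain in correlation over a baseline, as a fraction."""
    return (r_method - r_baseline) / r_baseline


def demo():
    # Simulated data only: three candidate scores with different noise levels.
    rng = random.Random(1)
    n = 1000
    g = [rng.gauss(0, 1) for _ in range(n)]
    outcome = [v + rng.gauss(0, 1) for v in g]
    scores = {
        "p<1e-8": [v + rng.gauss(0, 3) for v in g],
        "p<0.05": [v + rng.gauss(0, 1) for v in g],
        "p<1": [v + rng.gauss(0, 2) for v in g],
    }
    picks, mean_r = cv_select(scores, outcome)
    print("chosen per fold:", picks)
    print("mean held-out correlation:", round(mean_r, 3))


if __name__ == "__main__":
    demo()
```

Evaluating the selected score only on held-out individuals avoids the optimism that would come from tuning and testing on the same sample, which is why methods producing multiple scores need a tuning sample at all.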
Pages: 22