Evaluation of polygenic prediction methodology within a reference-standardized framework

Cited by: 88
Authors
Pain, Oliver [1, 2]
Glanville, Kylie P. [1]
Hagenaars, Saskia P. [1]
Selzam, Saskia [1]
Furtjes, Anna E. [1]
Gaspar, Helena A. [1]
Coleman, Jonathan R. I. [1]
Rimfeld, Kaili [1]
Breen, Gerome [1, 2]
Plomin, Robert [1]
Folkersen, Lasse [3]
Lewis, Cathryn M. [1, 2, 4]
Affiliations
[1] Kings Coll London, Social Genet & Dev Psychiat Ctr, Inst Psychiat, London, England
[2] South London & Maudsley NHS Trust, NIHR Maudsley Biomed Res Ctr, London, England
[3] Sankt Hans Hosp, Inst Biol Psychiat, Copenhagen, Denmark
[4] Kings Coll London, Fac Life Sci & Med, Dept Med & Mol Genet, London, England
Source
PLOS GENETICS | 2021 / Volume 17 / Issue 05
Funding
UK Medical Research Council
Keywords
SCORES; HERITABILITY
DOI
10.1371/journal.pgen.1009021
Chinese Library Classification number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
The predictive utility of polygenic scores is increasing, and many polygenic scoring methods are available, but it is unclear which method performs best. This study evaluates the predictive utility of polygenic scoring methods within a reference-standardized framework, which uses a common set of variants and reference-based estimates of linkage disequilibrium and allele frequencies to construct scores. Eight polygenic score methods were tested: p-value thresholding and clumping (pT+clump), SBLUP, lassosum, LDpred1, LDpred2, PRScs, DBSLMM and SBayesR, evaluating their performance to predict outcomes in UK Biobank and the Twins Early Development Study (TEDS). Strategies to identify optimal p-value threshold and shrinkage parameters were compared, including 10-fold cross validation, pseudovalidation and infinitesimal models (with no validation sample), and multi-polygenic score elastic net models. LDpred2, lassosum and PRScs performed strongly using 10-fold cross-validation to identify the most predictive p-value threshold or shrinkage parameter, giving a relative improvement of 16-18% over pT+clump in the correlation between observed and predicted outcome values. Using pseudovalidation, the best methods were PRScs, DBSLMM and SBayesR. PRScs pseudovalidation was only 3% worse than the best polygenic score identified by 10-fold cross validation. Elastic net models containing polygenic scores based on a range of parameters consistently improved prediction over any single polygenic score. Within a reference-standardized framework, the best polygenic prediction was achieved using LDpred2, lassosum and PRScs, modeling multiple polygenic scores derived using multiple parameters. This study will help researchers performing polygenic score studies to select the most powerful and predictive analysis methods.

Author summary
An individual's genetic predisposition to a given outcome can be summarized using polygenic scores. Polygenic scores are widely used in research and could also be used in a clinical setting to enhance personalized medicine. A range of methods have been developed for calculating polygenic scores, but it is unclear which methods are the best. Several methods provide multiple polygenic scores for each individual which must then be tested in an independent tuning sample to identify which polygenic score is most accurate. Other methods provide a single polygenic score and therefore do not require a tuning sample. Our study compares the prediction accuracy of eight leading polygenic scoring methods in a range of contexts. For methods that calculate multiple polygenic scores, we find that LDpred2, lassosum, and PRScs methods perform best on average. For methods that provide a single polygenic score, not requiring a tuning sample, we find PRScs performs best, and the faster DBSLMM and SBayesR methods also perform well. Our study has provided a comprehensive comparison of polygenic scoring methods that will guide future implementation of polygenic scores in both research and clinical settings.
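
The tuning step described in the abstract, where several candidate polygenic scores per individual are compared by 10-fold cross-validation and methods are ranked by relative improvement in correlation, can be sketched roughly as follows. This is an illustrative sketch only and not necessarily the procedure used in the study: the function names `cv_select_score` and `relative_improvement`, the array layout, and the simulated data are all assumptions introduced here, and the numbers are synthetic rather than results from UK Biobank or TEDS.

```python
import numpy as np


def cv_select_score(scores, outcome, n_folds=10, seed=0):
    """Pick the candidate score that wins most often on the training folds and
    report the mean held-out correlation of the per-fold winners.

    scores  : (n_individuals, n_candidates) array, one column per p-value
              threshold or shrinkage parameter (hypothetical layout).
    outcome : (n_individuals,) array of observed outcome values.
    """
    n_candidates = scores.shape[1]
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(outcome)), n_folds)
    picks = np.zeros(n_candidates, dtype=int)
    held_out_r = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([f for j, f in enumerate(folds) if j != k])
        # Choose the parameter using the training folds only.
        train_r = [np.corrcoef(scores[train, c], outcome[train])[0, 1]
                   for c in range(n_candidates)]
        best = int(np.argmax(train_r))
        picks[best] += 1
        # Evaluate the chosen candidate on the held-out fold.
        held_out_r.append(np.corrcoef(scores[test, best], outcome[test])[0, 1])
    return int(np.argmax(picks)), float(np.mean(held_out_r))


def relative_improvement(r_method, r_reference):
    """Relative gain in correlation over a reference method such as pT+clump."""
    return (r_method - r_reference) / r_reference


# Synthetic demonstration: three candidate scores that are increasingly or
# decreasingly noisy versions of a simulated genetic component.
rng = np.random.default_rng(42)
n = 2000
genetic = rng.normal(size=n)
outcome = genetic + rng.normal(size=n)
noise_sd = [3.0, 0.5, 2.0]  # assumed noise levels; candidate 1 is the cleanest
scores = np.column_stack(
    [genetic + rng.normal(scale=s, size=n) for s in noise_sd]
)

best, mean_r = cv_select_score(scores, outcome)
print("selected candidate:", best)
print("mean held-out correlation:", round(mean_r, 3))
```

Selecting the parameter on the training folds and scoring it on the held-out fold keeps the reported correlation from being inflated by the selection itself, which is why a tuning sample is needed for methods that output multiple scores. The elastic net approach mentioned in the abstract would instead fit a penalized regression on all candidate columns jointly rather than keeping a single column.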
Pages: 22