Participant identification in genetic association studies: improved methods and practical implications

Cited by: 13
Authors
Masca, Nicholas [1 ]
Burton, Paul R. [1 ]
Sheehan, Nuala A. [1 ]
Affiliations
[1] Univ Leicester, Dept Hlth Sci, Leicester LE1 7RH, Leics, England
Funding
UK Medical Research Council; Wellcome Trust (UK)
Keywords
Identification; linear regression; generalized estimating equations; linkage disequilibrium; case-control genetic association studies; GENOME-WIDE ASSOCIATION; ESTIMATING EQUATIONS; LONGITUDINAL DATA; POPULATIONS; PRIVACY; MIXTURE; DISEASE; MODELS;
DOI
10.1093/ije/dyr149
Chinese Library Classification number
R1 [Preventive medicine, hygiene];
Discipline classification code
1004 ; 120402 ;
Abstract
Background: In a recent paper by Homer et al. (Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays. PLoS Genet 2008; 4: e1000167), a method for detecting whether a given individual is a contributor to a particular genomic mixture was proposed. This prompted grave concern about the public dissemination of aggregate statistics from genome-wide association studies. It is of clear scientific importance that such data be shared widely, but the confidentiality of study participants must not be compromised. The issue of what summary genomic data can safely be posted on the web is only addressed satisfactorily when the theoretical underpinnings of the proposed method are clarified and its performance evaluated in terms of dependence on underlying assumptions.

Methods: The original method raised a number of concerns and several alternatives have since been proposed, including a simple linear regression approach. In our proposed generalized estimating equation approach, we maintain the simplicity of the linear regression model but obtain inferences that are more robust to approximation of the variance/covariance structure and can accommodate linkage disequilibrium.

Results: We affirm that, in principle, it is possible to determine that a 'candidate' individual has participated in a study, given a subset of aggregate statistics from that study. However, the methods depend critically on a number of key factors including: the ancestry of participants in the study; the absolute and relative numbers of cases and controls; and the number of single nucleotide polymorphisms.

Conclusions: Simple guidelines for publication that are based on a single criterion are therefore unlikely to suffice. In particular, 'directed' summary statistics should not be posted openly on the web but could be protected by an internet-based access check as proposed by the P3G Consortium et al. (Public access to genome-wide data: five views on balancing research with privacy and protection. PLoS Genet 2009; 5: e1000665).
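
The simple linear regression idea mentioned under Methods can be illustrated with a toy simulation. What follows is only a sketch under strong assumptions that are not taken from the paper: SNPs are independent (no linkage disequilibrium), reference allele frequencies are known exactly, and the study is a single group rather than cases and controls. It is not the authors' generalized estimating equation method. The function names `membership_slope` and `simulate` are hypothetical. The candidate's deviation from the reference frequency is regressed (through the origin) on the study's deviation from the reference frequency; a slope near 1 suggests the candidate contributed to the aggregate, and a slope near 0 suggests they did not.

```python
import numpy as np


def membership_slope(candidate_genotypes, study_freq, reference_freq):
    """Least-squares slope (through the origin) of the candidate's allele
    deviation on the study's allele-frequency deviation, per SNP."""
    # Candidate genotype is coded 0/1/2; divide by 2 to get an allele proportion.
    y = np.asarray(candidate_genotypes, dtype=float) / 2.0 - reference_freq
    x = np.asarray(study_freq, dtype=float) - reference_freq
    return float(np.sum(x * y) / np.sum(x * x))


def simulate(n_snps=20000, n_study=100, seed=0):
    """Toy data: independent SNPs, one study group, known reference frequencies."""
    rng = np.random.default_rng(seed)
    reference_freq = rng.uniform(0.1, 0.9, size=n_snps)
    # Genotypes of study participants and of one person outside the study.
    study = rng.binomial(2, reference_freq, size=(n_study, n_snps))
    outsider = rng.binomial(2, reference_freq, size=n_snps)
    # Aggregate statistic that a study might publish: per-SNP allele frequency.
    study_freq = study.mean(axis=0) / 2.0
    member_slope = membership_slope(study[0], study_freq, reference_freq)
    outsider_slope = membership_slope(outsider, study_freq, reference_freq)
    return member_slope, outsider_slope


if __name__ == "__main__":
    member_slope, outsider_slope = simulate()
    print(f"participant slope:     {member_slope:.3f}")
    print(f"non-participant slope: {outsider_slope:.3f}")
```

In this idealized setting the separation between the two slopes widens as the number of SNPs grows relative to the study size, which is consistent with the abstract's point that performance depends on the number of single nucleotide polymorphisms and the number of participants. Real data with linkage disequilibrium or mismatched ancestry would violate the assumptions made here.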
Pages: 1629-1642
Number of pages: 14
Related papers
50 in total
  • [1] An Improved Score Test for Genetic Association Studies
    Sha, Qiuying
    Zhang, Zhaogong
    Zhang, Shuanglin
    [J]. GENETIC EPIDEMIOLOGY, 2011, 35 (05) : 350 - 359
  • [2] Bayesian statistical methods for genetic association studies
    Matthew Stephens
    David J. Balding
    [J]. Nature Reviews Genetics, 2009, 10 : 681 - 690
  • [3] Statistical methods for the analysis of genetic association studies
    Zou, GY
    [J]. ANNALS OF HUMAN GENETICS, 2006, 70 : 262 - 276
  • [4] Bayesian statistical methods for genetic association studies
    Stephens, Matthew
    Balding, David J.
    [J]. NATURE REVIEWS GENETICS, 2009, 10 (10) : 681 - 690
  • [5] A review of kernel methods for genetic association studies
    Larson, Nicholas B.
    Chen, Jun
    Schaid, Daniel J.
    [J]. GENETIC EPIDEMIOLOGY, 2019, 43 (02) : 122 - 136
  • [6] A Comparison of Analytical Methods for Genetic Association Studies
    Motsinger-Reif, Alison A.
    Reif, David M.
    Fanelli, Theresa J.
    Ritchie, Marylyn D.
    [J]. GENETIC EPIDEMIOLOGY, 2008, 32 (08) : 767 - 778
  • [7] IMPLICATIONS OF PARTICIPANT OBSERVATION IN MEDICAL STUDIES
    MURRAY, WB
    BUCKINGHAM, RW
    [J]. CANADIAN MEDICAL ASSOCIATION JOURNAL, 1976, 115 (12) : 1187 - &
  • [8] Sample Size Calculation in Genetic Association Studies: A Practical Approach
    Politi, Cristina
    Roumeliotis, Stefanos
    Tripepi, Giovanni
    Spoto, Belinda
    [J]. LIFE-BASEL, 2023, 13 (01):
  • [9] Mortality selection in a genetic sample and implications for association studies
    Domingue, Benjamin W.
    Belsky, Daniel W.
    Harrati, Amal
    Conley, Dalton
    Weir, David R.
    Boardman, Jason D.
    [J]. INTERNATIONAL JOURNAL OF EPIDEMIOLOGY, 2017, 46 (04) : 1285 - 1294
  • [10] Cancer heterogeneity: origins and implications for genetic association studies
    Urbach, Davnah
    Lupien, Mathieu
    Karagas, Margaret R.
    Moore, Jason H.
    [J]. TRENDS IN GENETICS, 2012, 28 (11) : 538 - 543