Detecting Sample Misidentifications in Genetic Association Studies

Cited by: 3
Authors
Ekstrom, Claus T. [1]
Feenstra, Bjarke [2]
Affiliations
[1] University of Southern Denmark, Faculty of Health Sciences, Odense, Denmark
[2] Statens Serum Institut, Department of Epidemiology Research, Copenhagen, Denmark
Funding
Wellcome Trust (UK); National Research Foundation, Singapore
Keywords
error detection; genome-wide association studies; known genotype-phenotype associations; outlier detection; quality control; power
DOI
10.1515/1544-6115.1772
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Genetic association studies require that the genotype data from a given person can be correctly linked to the phenotype data from the same person. However, sample misidentification errors sometimes happen, whereby the link becomes invalid for some of the subjects in a study. This can have substantial consequences in terms of power to detect truly associated variants. In family-based studies, Mendelian inconsistencies can be used to detect sample misidentification. Genome-wide association studies (GWAS), however, typically use unrelated individuals, making error detection more problematic. Here we present a method for identifying potential sample misidentifications in GWAS and other genetic association studies building on ideas from forensic sciences. A widely used ad-hoc method for error detection is to check if the sex of an individual matches its X-linked genotype. We generalize this idea to less stringent associations between known genotypes and phenotypes, and show that if several known associations are combined, the power to detect misidentifications increases substantially. Individuals with an unlikely set of phenotypes given their genotypes are flagged as potential errors. We provide analytical and simulation results comparing the odds that the genotype and phenotype are both from the same individual for different numbers of available genotype-phenotype associations and for different information content of the associations. Our method has good sensitivity and specificity with as few as ten moderately informative genotype-phenotype associations. We apply the method to GWAS data from the Danish National Birth Cohort.
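The idea of combining several known genotype-phenotype associations into a single match score can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the associations are independent, that each one is summarized by a table of P(phenotype | genotype) plus genotype frequencies, and that a fixed cutoff is used for flagging. The names `log_match_ratio`, `flag_suspects`, and `threshold` are hypothetical.

```python
import math

def log_match_ratio(genotypes, phenotypes, cond_tables, geno_freqs):
    """Log ratio of P(phenotypes | genotypes) under 'same person'
    versus 'phenotypes drawn from a random person'.

    Assumes independent associations (an illustrative simplification).
    cond_tables[i][g][y] = P(phenotype y | genotype g) for association i.
    geno_freqs[i][g]     = population frequency of genotype g.
    """
    total = 0.0
    for g, y, table, freqs in zip(genotypes, phenotypes, cond_tables, geno_freqs):
        p_given_g = table[g][y]
        # Marginal phenotype probability, averaging over genotypes.
        p_marginal = sum(f * table[k][y] for k, f in enumerate(freqs))
        total += math.log(p_given_g / p_marginal)
    return total

def flag_suspects(scores, threshold=0.0):
    """Return indices whose score falls below the (assumed) threshold."""
    return [i for i, s in enumerate(scores) if s < threshold]

# Ten moderately informative associations with identical, made-up parameters.
n_assoc = 10
table = [[0.7, 0.3], [0.5, 0.5], [0.3, 0.7]]   # P(y | g) for g = 0, 1, 2
freqs = [0.25, 0.5, 0.25]
tables = [table] * n_assoc
all_freqs = [freqs] * n_assoc

consistent = log_match_ratio([2] * n_assoc, [1] * n_assoc, tables, all_freqs)
inconsistent = log_match_ratio([2] * n_assoc, [0] * n_assoc, tables, all_freqs)
suspects = flag_suspects([consistent, inconsistent])
```

Each association contributes only a small amount of evidence on its own, but the contributions add on the log scale, which is why a handful of moderately informative associations can separate matching from mismatching records.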
Pages: 19