Effective sample size: Quick estimation of the effect of related samples in genetic case-control association analyses

Cited by: 12
Authors
Yang, Yaning [2 ]
Remmers, Elaine F. [3 ]
Ogunwole, Chukwuma B. [3 ]
Kastner, Daniel L. [3 ]
Gregersen, Peter K. [1 ]
Li, Wentian [1 ]
Affiliations
[1] N Shore LIJ Hlth Syst, Feinstein Inst Med Res, Robert S Boas Ctr Genom & Human Genet, Manhasset, NY 11030 USA
[2] Univ Sci & Technol China, Dept Stat & Finance, Hefei 230026, Anhui, Peoples R China
[3] NIAMSD, Genet & Genom Branch, NIH, Bethesda, MD 20892 USA
Funding
US National Institutes of Health; National Natural Science Foundation of China;
Keywords
Genetic association; Correlation; Variance inflation; Effective sample size; GENOME-WIDE ASSOCIATION; COMPLEX HUMAN-DISEASES; CASE-CONTROL DESIGNS; LINKAGE DISEQUILIBRIUM; RHEUMATOID-ARTHRITIS; POPULATION STRATIFICATION; ESTIMATING EQUATIONS; ALLELE FREQUENCIES; CORRELATED DATA; INDIVIDUALS;
DOI
10.1016/j.compbiolchem.2010.12.006
Chinese Library Classification number
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Affected relatives are essential for pedigree linkage analysis; however, they violate the independent-sample assumption in case-control association studies. To avoid correlation between samples, a common practice is to take only one affected sample per pedigree in association analysis. Although several methods exist for handling correlated samples, they are still not widely used, in part because they are not easily implemented or because they are not widely known. We advocate the effective sample size method as a simple and accessible approach for case-control association analysis with correlated samples. This method modifies the chi-square test statistic, p-value, and 95% confidence interval of the odds ratio by replacing the apparent allele or genotype counts with the effective ones in the standard formulas, without the need for specialized computer programs. We present a simple formula for calculating the effective sample size for many types of relative pairs and relative sets. For allele frequency estimation, the effective sample size method captures the variance inflation exactly. For genotype frequency, simulations show that the effective sample size provides a satisfactory approximation. A gene previously identified as a type 1 diabetes susceptibility locus, the interferon-induced helicase gene (IFIH1), is shown to be significantly associated with rheumatoid arthritis when the effective sample size method is applied. This significant association is not established if only one affected sib per pedigree is used in the association analysis. The relationships between the effective sample size method and other methods (generalized estimating equations, variance of eigenvalues of correlation matrices, and genomic control) are discussed. (C) 2011 Elsevier Ltd. All rights reserved.
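The count-replacement idea in the abstract can be illustrated with a minimal sketch. It does not reproduce the paper's own formula, which the abstract does not state. It assumes the generic design-effect result that n equally variable observations with correlation matrix R carry as much information about a mean as n^2 / sum(R) independent ones. The function names, the 2x2 table, and the pairwise correlation of 0.5 are hypothetical values chosen for illustration only.

```python
import math

def effective_size(corr):
    """Generic effective sample size: n^2 divided by the sum of all
    entries of the n x n correlation matrix (assumption, not the
    paper's formula)."""
    n = len(corr)
    total = sum(sum(row) for row in corr)
    return n * n / total

def scaled_chi_square(table, scale_case=1.0, scale_control=1.0):
    """Pearson chi-square (1 df) for a 2x2 table [[a, b], [c, d]],
    with case and control rows first scaled by effective/apparent ratios."""
    (a, b), (c, d) = table
    a, b = a * scale_case, b * scale_case
    c, d = c * scale_control, d * scale_control
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical example: every case pedigree contributes a pair of relatives
# whose observations have correlation 0.5, so each pair counts as 4/3 samples.
pair_corr = [[1.0, 0.5], [0.5, 1.0]]
ratio = effective_size(pair_corr) / len(pair_corr)  # effective / apparent

# Hypothetical allele counts: rows are cases and controls.
counts = [[30, 70], [20, 80]]
naive = scaled_chi_square(counts)
adjusted = scaled_chi_square(counts, scale_case=ratio)
```

Scaling only the case row reflects the situation described in the abstract, where the relatives are affected individuals and the controls are taken to be independent. The adjusted statistic is smaller than the naive one because correlated cases carry less information than their apparent count suggests.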
Pages: 40-49
Number of pages: 10