Estimating haplotype frequencies and standard errors for multiple single nucleotide polymorphisms

Cited by: 63
Authors
Li, SSY
Khalid, N
Carlson, C
Zhao, LP
Affiliations
[1] Fred Hutchinson Canc Res Ctr, Div Publ Hlth Sci, Seattle, WA 98109 USA
[2] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
Keywords
estimating equation; haplotype; Hardy-Weinberg equilibrium; single nucleotide polymorphism (SNP)
DOI
10.1093/biostatistics/4.4.513
Chinese Library Classification number
Q [Biological sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Estimating haplotype frequencies is becoming increasingly important in the mapping of complex disease genes, as millions of single nucleotide polymorphisms (SNPs) are being identified and genotyped. When genotypes at multiple SNP loci are gathered from unrelated individuals, haplotype frequencies can be accurately estimated using expectation-maximization (EM) algorithms (Excoffier and Slatkin, 1995; Hawley and Kidd, 1995; Long et al., 1995), with standard errors estimated by the bootstrap. However, because the number of possible haplotypes increases exponentially with the number of SNPs, handling data with a large number of SNPs poses a computational challenge for the EM methods and for other haplotype inference methods. To solve this problem, Niu and colleagues, in their Bayesian haplotype inference paper (Niu et al., 2002), introduced a computational algorithm called progressive ligation (PL). Their Bayesian method, however, limits the number of subjects (no more than 100 in the current implementation). In this paper, we propose a new method that uses the same likelihood formulation as Excoffier and Slatkin's EM algorithm and applies the estimating equation idea together with a modified version of the PL computational algorithm. The proposed method can handle data sets with large numbers of SNPs as well as large numbers of subjects. At the same time, it estimates standard errors efficiently, using the sandwich estimate from the estimating equation rather than the bootstrap. Additionally, the method admits missing data and produces valid estimates of parameters and their standard errors under the assumption that the missing genotypes are missing at random in the sense defined by Rubin (1976).
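To make the baseline concrete, the following is a minimal sketch of a plain EM iteration for haplotype frequencies under Hardy-Weinberg equilibrium, in the spirit of the EM algorithms cited in the abstract. It is not the estimating-equation method proposed in the paper and does not include progressive ligation, the sandwich standard errors, or missing-data handling. The function names (`compatible_pairs`, `em_haplotype_frequencies`) and the genotype coding (0, 1, or 2 copies of allele 1 at each biallelic SNP) are assumptions made for illustration only.

```python
from itertools import product


def compatible_pairs(genotype):
    """Return the unordered haplotype pairs consistent with an unphased genotype.

    genotype[i] is the number of copies (0, 1 or 2) of allele 1 at SNP i
    (an illustrative coding, not taken from the paper).
    """
    het = [i for i, g in enumerate(genotype) if g == 1]
    base = [g // 2 for g in genotype]  # homozygous sites are fixed; het sites get a placeholder
    pairs = set()
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = list(base), list(base)
        for site, b in zip(het, bits):
            h1[site], h2[site] = b, 1 - b
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies by EM, assuming Hardy-Weinberg equilibrium."""
    pair_lists = [compatible_pairs(g) for g in genotypes]
    haplotypes = sorted({h for pairs in pair_lists for pair in pairs for h in pair})
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    for _ in range(n_iter):
        counts = dict.fromkeys(haplotypes, 0.0)
        for pairs in pair_lists:
            # E-step: posterior weight of each compatible pair under HWE
            weights = [freq[a] * freq[b] * (1 if a == b else 2) for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: expected haplotype counts divided by the number of chromosomes
        new_freq = {h: c / (2 * len(genotypes)) for h, c in counts.items()}
        change = max(abs(new_freq[h] - freq[h]) for h in haplotypes)
        freq = new_freq
        if change < tol:
            break
    return freq
```

A genotype with k heterozygous SNPs has 2^(k-1) compatible haplotype pairs (for k ≥ 1), so the enumeration in `compatible_pairs` grows exponentially with the number of SNPs. This is the computational burden the abstract describes for EM-based methods on data with many SNPs.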
Pages: 513 - 522
Number of pages: 10
Related papers
50 in total
  • [41] ESTIMATING STANDARD ERRORS FOR IMPORTANCE SAMPLING ESTIMATORS WITH MULTIPLE MARKOV CHAINS
    Roy, Vivekananda
    Tan, Aixin
    Flegal, James M.
    STATISTICA SINICA, 2018, 28 (02) : 1079 - 1101
  • [42] SELPLG and SELP single-nucleotide polymorphisms in multiple sclerosis
    Fenoglio, C
    Galimberti, D
    Ban, M
    Maranian, M
    Scalabrini, D
    Venturelli, E
    Piccio, L
    De Riz, M
    Yeo, TW
    Goris, A
    Gray, J
    Bresolin, N
    Scarpini, E
    Compston, A
    Sawcer, S
    NEUROSCIENCE LETTERS, 2006, 394 (02) : 92 - 96
  • [43] Single nucleotide polymorphisms
    Ok, KY
    Kyu, KM
    Jong, WY
    Cheol, LM
    Hee, KJ
    Won, PK
    Young, KE
    Young, R
    Jong, KC
    EPILEPSIA, 2005, 46 : 259 - 259
  • [44] Association of the single nucleotide polymorphisms and the haplotype of the IL-18 gene with atopic dermatitis in Koreans
    Kim, E.
    Lee, J.
    Namkung, J.
    Park, J.
    Kim, S.
    Shin, E.
    Cho, E.
    Yang, J.
    JOURNAL OF INVESTIGATIVE DERMATOLOGY, 2007, 127 : S86 - S86
  • [46] Partition-ligation-expectation-maximization algorithm for haplotype inference with single-nucleotide polymorphisms
    Qin, ZHS
    Niu, TH
    Liu, JS
    AMERICAN JOURNAL OF HUMAN GENETICS, 2002, 71 (05) : 1242 - 1247
  • [47] Association of single nucleotide polymorphisms and haplotype in SPINK5 gene with atopic dermatitis in Koreans
    Namkung, J.
    Lee, J.
    Kim, E.
    Byun, J.
    Kim, S.
    Shin, E.
    Cho, E.
    Yang, I.
    JOURNAL OF INVESTIGATIVE DERMATOLOGY, 2010, 130 : S88 - S88
  • [48] HAPLOTYPE OF MULTIPLE POLYMORPHISMS RESOLVED BY ENZYMATIC AMPLIFICATION OF SINGLE DNA-MOLECULES
    RUANO, G
    KIDD, KK
    STEPHENS, JC
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1990, 87 (16) : 6296 - 6300
  • [49] Frequencies of CTLA-4 single nucleotide polymorphisms in allergic asthma and healthy controls
    Winiarska, B
    Jasek, M
    Obojski, A
    Nowak, I
    Manczak, M
    Wisniewski, A
    Luszczek, W
    Kusnierczyk, P
    TISSUE ANTIGENS, 2004, 64 (04) : 416 - 416
  • [50] A method for finding single-nucleotide polymorphisms with allele frequencies in sequences of deep coverage
    Wang, JM
    Huang, XQ
    BMC BIOINFORMATICS, 2005, 6 (1)