Bayesian hierarchical hypothesis testing in large-scale genome-wide association analysis

Authors
Samaddar, Anirban [1]
Maiti, Tapabrata [1]
de los Campos, Gustavo [1, 2, 3]
Affiliations
[1] Michigan State Univ, Dept Stat & Probabil, E Lansing, MI 48824 USA
[2] Michigan State Univ, Dept Epidemiol & Biostat, E Lansing, MI 48824 USA
[3] Michigan State Univ, Inst Quantitat Hlth Sci & Engn, E Lansing, MI 48824 USA
Keywords
Bayesian variable selection; Bayesian hierarchical hypothesis testing; false discovery rate; GWAS; collinearity; multiresolution inference; spike and slab prior; linkage disequilibrium; UK-Biobank data; variable selection; regression; heritability; prediction
DOI
10.1093/genetics/iyae164
Chinese Library Classification number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Variable selection and large-scale hypothesis testing are techniques commonly used to analyze high-dimensional genomic data. Despite recent advances in theory and methodology, variable selection and inference with highly collinear features remain challenging. For instance, collinearity poses a great challenge in genome-wide association studies involving millions of variants, many of which may be in high linkage disequilibrium. In such settings, collinearity can significantly reduce the power of variable selection methods to identify individual variants associated with an outcome. To address such challenges, we developed Bayesian hierarchical hypothesis testing (BHHT), a novel multiresolution testing procedure that offers high power with adequate error control and fine-mapping resolution. We demonstrate through simulations that the proposed methodology has a power-FDR performance that is competitive with (and in many scenarios better than) state-of-the-art methods. Finally, we demonstrate the feasibility of using BHHT with a large sample size (n ~ 300,000) and ultra-high-dimensional genotypes (~15 million single-nucleotide polymorphisms, or SNPs) by applying it to eight complex traits using data from the UK-Biobank. Our results show that the proposed methodology leads to many more discoveries than those obtained using traditional SNP-centered inference procedures. The article is accompanied by open-source software that implements the methods described in this study using algorithms that scale to biobank-size ultra-high-dimensional data.
Pages: 12