Interpretable artificial neural networks incorporating Bayesian alphabet models for genome-wide prediction and association studies

Cited by: 10
Authors
Zhao, Tianjing [1, 2]
Fernando, Rohan [3]
Cheng, Hao [1]
Affiliations
[1] Univ Calif Davis, Dept Anim Sci, Davis, CA 95616 USA
[2] Univ Calif Davis, Integrat Genet & Genom Grad Grp, Davis, CA 95616 USA
[3] Iowa State Univ, Dept Anim Sci, Ames, IA 50011 USA
Source
G3-GENES GENOMES GENETICS | 2021 / Vol. 11 / Issue 10
Funding
United States Department of Agriculture (USDA);
Keywords
neural networks; Bayesian regression models; JWAS; genomic prediction; GWAS; SELECTION; POLYMORPHISM; SINGLE; PHENOTYPES; ACCURACY; GENE
DOI
10.1093/g3journal/jkab228
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007; 090102
Abstract
In conventional linear models for whole-genome prediction and genome-wide association studies (GWAS), it is usually assumed that the relationship between genotypes and phenotypes is linear. Bayesian neural networks have been used to account for non-linearity, such as that arising from complex genetic architectures. Here, we introduce a method named NN-Bayes, where "NN" stands for neural networks, and "Bayes" stands for Bayesian Alphabet models, a collection of Bayesian regression models that includes BayesA, BayesB, BayesC, and Bayesian LASSO. NN-Bayes incorporates Bayesian Alphabet models into non-linear neural networks via hidden layers between single-nucleotide polymorphisms (SNPs) and observed traits. Thus, NN-Bayes attempts to improve the performance of genome-wide prediction and GWAS by accommodating non-linear relationships between the hidden nodes and the observed trait, while maintaining genomic interpretability through the Bayesian regression models that connect the SNPs to the hidden nodes. For genomic interpretability, the posterior distribution of marker effects in NN-Bayes is inferred by Markov chain Monte Carlo approaches and used for inference of association through posterior inclusion probabilities and window posterior probability of association. In simulation studies with dominance and epistatic effects, the performance of NN-Bayes was significantly better than that of conventional linear models for both GWAS and whole-genome prediction, and the differences in prediction accuracy were substantial in magnitude. In real-data analyses, for the soy dataset, NN-Bayes achieved significantly higher prediction accuracies than conventional linear models, while results from four other species showed that NN-Bayes had prediction performance similar to that of linear models, potentially because of the small sample sizes. Our NN-Bayes is optimized for high-dimensional genomic data and implemented in an open-source package called "JWAS." NN-Bayes can lead to greater use of Bayesian neural networks to account for non-linear relationships due to its interpretability and computational performance.
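The structure described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the JWAS implementation or its API: all function names are hypothetical, the tanh activation, the bias terms, and the toy numbers are assumptions, and the window posterior probability of association (WPPA) is computed under one common definition (the fraction of MCMC samples in which a window of SNPs explains more than a chosen share of the genomic variance), which is assumed here rather than taken from the abstract.

```python
import math
import statistics


def nn_bayes_forward(genotypes, marker_effects, hidden_bias, out_weights, out_bias):
    """Predict one trait value from one individual's SNP genotypes.

    genotypes      : list of p allele counts (0, 1, or 2)
    marker_effects : one list of p effects per hidden node (linear SNP layer)
    hidden_bias    : one intercept per hidden node (assumed)
    out_weights    : one weight per hidden node (non-linear layer to the trait)
    """
    # Linear, interpretable part: SNPs -> hidden nodes via marker effects.
    hidden = [
        b + sum(x * a for x, a in zip(genotypes, effects))
        for effects, b in zip(marker_effects, hidden_bias)
    ]
    # Non-linear part: hidden nodes -> trait (tanh is an assumed activation).
    activated = [math.tanh(h) for h in hidden]
    return out_bias + sum(w * g for w, g in zip(out_weights, activated))


def posterior_inclusion_probabilities(effect_samples):
    """PIP per SNP: fraction of MCMC samples in which its effect is non-zero."""
    n_samples = len(effect_samples)
    n_markers = len(effect_samples[0])
    return [
        sum(1 for s in effect_samples if s[j] != 0.0) / n_samples
        for j in range(n_markers)
    ]


def wppa(effect_samples, genotype_rows, window, threshold=0.01):
    """Fraction of samples in which the window's share of genomic variance
    exceeds `threshold` (assumed definition of WPPA)."""
    hits = 0
    for effects in effect_samples:
        total = [sum(x * a for x, a in zip(row, effects)) for row in genotype_rows]
        local = [sum(row[j] * effects[j] for j in window) for row in genotype_rows]
        total_var = statistics.pvariance(total)
        if total_var > 0 and statistics.pvariance(local) / total_var > threshold:
            hits += 1
    return hits / len(effect_samples)


# Toy data (made up): 4 individuals x 3 SNPs, and 4 MCMC samples of the
# marker effects feeding a single hidden node.
genotype_rows = [[0, 1, 2], [2, 1, 0], [1, 1, 1], [2, 0, 1]]
effect_samples = [
    [0.5, 0.0, 0.0],
    [0.4, 0.0, 0.1],
    [0.0, 0.0, 0.0],
    [0.6, 0.0, 0.0],
]

prediction = nn_bayes_forward(
    genotype_rows[0], [[0.5, 0.0, 0.0], [0.0, 0.0, 0.1]], [0.0, 0.0], [1.0, 2.0], 0.5
)
pip = posterior_inclusion_probabilities(effect_samples)  # [0.75, 0.0, 0.25]
window_support = wppa(effect_samples, genotype_rows, window=[0], threshold=0.5)
```

Because marker effects enter only the linear SNP-to-hidden-node layer, summaries such as the PIP can be read directly from their sampled values, while the non-linearity is confined to the step from hidden nodes to the trait.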
Pages: 10