A Graphical Model for Fusing Diverse Microbiome Data

Cited by: 1
Authors
Aktukmak, Mehmet [1]
Zhu, Haonan [1]
Chevrette, Marc G. [2, 3]
Nepper, Julia [3]
Magesh, Shruthi [3, 4]
Handelsman, Jo [2, 3]
Hero, Alfred [1]
Affiliations
[1] Univ Michigan, Dept Elect & Comp Engn, Ann Arbor, MI 48109 USA
[2] Univ Wisconsin, Dept Plant Pathol, Madison, WI 53706 USA
[3] Wisconsin Inst Discovery, Madison, WI 53715 USA
[4] Univ Wisconsin, Microbiol Doctoral Training Program, Madison, WI 53706 USA
Keywords
Bayesian probabilistic graphical model; data fusion; microbial data analysis; variational optimization; VARIATIONAL INFERENCE; UNDERSTAND; LASSO
DOI
10.1109/TSP.2023.3309464
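Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809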
Abstract
This paper develops a Bayesian graphical model for fusing disparate types of count data. The motivating application is the study of bacterial communities from diverse high-dimensional features, in this case, transcripts, collected from different treatments. In such datasets, there are no explicit correspondences between the communities and each corresponds to different factors, making data fusion challenging. We introduce a flexible multinomial-Gaussian generative model for jointly modeling such count data. This latent variable model jointly characterizes the observed data through a common multivariate Gaussian latent space that parameterizes the set of multinomial probabilities of the transcriptome counts. The covariance matrix of the latent variables induces a covariance matrix of co-dependencies between all the transcripts, effectively fusing multiple data sources. We present a computationally scalable variational Expectation-Maximization (EM) algorithm for inferring the latent variables and the parameters of the model. The inferred latent variables provide a common dimensionality reduction for visualizing the data and the inferred parameters provide a predictive posterior distribution. In addition to simulation studies that demonstrate the variational EM procedure, we apply our model to a bacterial microbiome dataset.
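The generative structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear loadings, the softmax link, the standard-normal latent prior, and all names (`sample_source`, `loadings_a`, `loadings_b`, `n_reads`) are assumptions made for illustration. It shows only how one shared Gaussian latent vector can parameterize the multinomial probabilities of two separate count sources.

```python
import math
import random


def softmax(logits):
    """Map real-valued logits to a probability vector."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_source(z, loadings, offsets, n_reads, rng):
    """Draw one multinomial count vector whose probabilities depend on z.

    Assumption: logits are a linear function of the latent vector z,
    passed through a softmax. The paper may use a different link.
    """
    logits = [
        sum(w * zj for w, zj in zip(row, z)) + b
        for row, b in zip(loadings, offsets)
    ]
    probs = softmax(logits)
    counts = [0] * len(probs)
    # Sample n_reads category indices and tally them into counts.
    for i in rng.choices(range(len(probs)), weights=probs, k=n_reads):
        counts[i] += 1
    return probs, counts


rng = random.Random(0)
n_latent = 2

# Hypothetical loadings for two data sources with different feature counts.
loadings_a = [[rng.gauss(0.0, 1.0) for _ in range(n_latent)] for _ in range(5)]
loadings_b = [[rng.gauss(0.0, 1.0) for _ in range(n_latent)] for _ in range(8)]

# One shared latent draw (standard normal here, as an assumption).
z = [rng.gauss(0.0, 1.0) for _ in range(n_latent)]

probs_a, counts_a = sample_source(z, loadings_a, [0.0] * 5, 1000, rng)
probs_b, counts_b = sample_source(z, loadings_b, [0.0] * 8, 500, rng)
```

Because both count vectors are driven by the same `z`, dependence between features of the two sources arises through the latent space, which is the sense in which the abstract describes the latent covariance as fusing multiple data sources.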
Pages: 3399-3412
Page count: 14