A Bayesian Approach for Model-Based Clustering of Several Binary Dissimilarity Matrices: The dmbc Package in R

Cited by: 1
Authors
Venturini, Sergio [1]
Piccarreta, Raffaella [2]
Affiliations
[1] Univ Cattolica Sacro Cuore, Dipartimento Sci Econ & Sociali, Via Bissolati 74, I-26100 Cremona, Italy
[2] Univ Commerciale L Bocconi, Dipartimento Sci Decis, Via Rontgen 1, I-20136 Milan, Italy
Source
JOURNAL OF STATISTICAL SOFTWARE | 2021 / Vol. 100 / Issue 16
Keywords
Bayesian data analysis; dissimilarity matrices; information criteria; multidimensional scaling; MCMC; MDS; mixture models; model-based clustering; three-way MDS; CHAIN MONTE-CARLO; EMPIRICAL PAIRWISE ORDERINGS; MAXIMUM-LIKELIHOOD METHOD; INDIVIDUAL-DIFFERENCES; MIXTURE;
DOI
10.18637/jss.v100.i16
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
We introduce the new package dmbc, which implements a Bayesian algorithm for clustering a set of binary dissimilarity matrices within a model-based framework. Specifically, we consider the case in which S matrices are available, each describing the dissimilarities among the same n objects, possibly expressed by S subjects (judges), or measured under different experimental conditions, or with reference to different characteristics of the objects themselves. In particular, we focus on binary dissimilarities, which take the value 0 or 1 depending on whether or not two objects are deemed dissimilar. We are interested in analyzing such data using multidimensional scaling (MDS). Unlike standard MDS algorithms, our goal is to cluster the dissimilarity matrices and, simultaneously, to extract an MDS configuration specific to each cluster. To this end, we develop a fully Bayesian three-way MDS approach, where the elements of each dissimilarity matrix are modeled as a mixture of Bernoulli random vectors. The parameter estimates and the MDS configurations are derived using a hybrid Metropolis-Gibbs Markov chain Monte Carlo algorithm. We also propose a BIC-like criterion for jointly selecting the optimal number of clusters and latent space dimensions. We illustrate our approach using both synthetic data and a publicly available data set taken from the literature. For the sake of efficiency, the core computations in the package are implemented in C/C++. The package also allows the simulation of multiple chains through the support of the parallel package.
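To make the data structure described in the abstract concrete, the following is a minimal Python sketch that simulates S binary dissimilarity matrices on the same n objects from a mixture of G clusters, each with its own latent configuration in p dimensions. It is not the dmbc package or its API: the function name, the parameter `alpha`, and the logistic link between latent distance and the probability of a dissimilarity of 1 are all assumptions made for illustration only.

```python
import math
import random


def simulate_binary_dissimilarities(n, S, G, p, alpha, seed=0):
    """Simulate S symmetric binary dissimilarity matrices for n objects.

    Hypothetical generative sketch (NOT the dmbc implementation):
    each of the G clusters has its own latent configuration of n points
    in p dimensions, and each subject belongs to exactly one cluster.
    """
    rng = random.Random(seed)
    # One latent configuration (n points in p dimensions) per cluster.
    configs = [
        [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
        for _ in range(G)
    ]
    # Cluster membership of each subject (judge).
    labels = [rng.randrange(G) for _ in range(S)]
    matrices = []
    for s in range(S):
        g = labels[s]
        Z = configs[g]
        D = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                dist = math.dist(Z[i], Z[j])
                # Assumed logistic link: larger latent distance means a
                # higher probability that the pair is judged dissimilar.
                prob = 1.0 / (1.0 + math.exp(-(alpha[g] + dist)))
                d = 1 if rng.random() < prob else 0
                D[i][j] = d
                D[j][i] = d
        matrices.append(D)
    return matrices, labels


if __name__ == "__main__":
    mats, labels = simulate_binary_dissimilarities(
        n=6, S=4, G=2, p=2, alpha=[-1.0, 0.5], seed=1
    )
    for s, D in enumerate(mats):
        print("subject", s, "cluster", labels[s])
        for row in D:
            print(row)
```

Each simulated matrix is symmetric with a zero diagonal, and subjects assigned to the same cluster share one latent configuration, which is the structure that a clustering of dissimilarity matrices would aim to recover.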
Pages: 1-35
Number of pages: 35