Replica analysis of Bayesian data clustering

Authors
Mozeika, Alexander [1]
Coolen, Anthony C. C. [2, 3]
Affiliations
[1] Kings Coll London, Inst Math & Mol Biomed, Hodgkin Bldg, London SE1 1UL, England
[2] Kings Coll London, Dept Math, London WC2R 2LS, England
[3] London Inst Math Sci, 35A South St, London W1K 2XF, England
Funding
UK Medical Research Council
Keywords
clustering; Bayesian inference; replica; STATISTICAL-MECHANICS; CLASSIFICATION
DOI
10.1088/1751-8121/ab59af
Chinese Library Classification
O4 [Physics]
Discipline classification code
0702
Abstract
We use statistical mechanics to study model-based Bayesian data clustering. In this approach, each partition of the data into clusters is regarded as a microscopic system state, the negative data log-likelihood gives the energy of each state, and the data set realisation acts as disorder. Optimal clustering corresponds to the ground state of the system, and is hence obtained from the free energy via a low 'temperature' limit. We assume that for large sample sizes the free energy density is self-averaging, and we use the replica method to compute the asymptotic free energy density. The main order parameter in the resulting (replica symmetric) theory, the distribution of the data over the clusters, satisfies a self-consistent equation which can be solved by a population dynamics algorithm. From this order parameter one computes the average free energy, and all relevant macroscopic characteristics of the problem. The theory describes numerical experiments perfectly, and gives a significant improvement over the mean-field theory that was used to study this model in the past.
Pages: 32
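
The mapping described in the abstract (partitions as states, negative log-likelihood as energy, optimal clustering as the low-temperature limit of the free energy) can be illustrated on a toy problem by brute-force enumeration. The sketch below is not the replica calculation or the population dynamics algorithm of the paper. It assumes, purely for illustration, one-dimensional data, unit-variance Gaussian clusters with maximum-likelihood means, and it allows empty clusters; the function names `energy`, `free_energy` and `ground_state` are hypothetical.

```python
import itertools
import math


def energy(data, assignment, k):
    """Negative log-likelihood (up to constants) of one partition.

    Assumption: each cluster is a unit-variance Gaussian whose mean is
    set to the sample mean of its members. Empty clusters contribute 0.
    """
    total = 0.0
    for mu in range(k):
        members = [x for x, c in zip(data, assignment) if c == mu]
        if not members:
            continue
        mean = sum(members) / len(members)
        total += 0.5 * sum((x - mean) ** 2 for x in members)
    return total


def free_energy(data, k, beta):
    """F(beta) = -(1/beta) * log sum_c exp(-beta * E(c)), by enumeration."""
    energies = [
        energy(data, a, k)
        for a in itertools.product(range(k), repeat=len(data))
    ]
    e_min = min(energies)
    # Shift by the minimum energy so the exponentials cannot overflow.
    log_z = -beta * e_min + math.log(
        sum(math.exp(-beta * (e - e_min)) for e in energies)
    )
    return -log_z / beta


def ground_state(data, k):
    """Partition with the lowest energy, i.e. the optimal clustering."""
    return min(
        itertools.product(range(k), repeat=len(data)),
        key=lambda a: energy(data, a, k),
    )


data = [0.0, 0.1, 5.0, 5.1]
best = ground_state(data, 2)
print(best, energy(data, best, 2))
for beta in (1.0, 10.0, 1000.0):
    print(beta, free_energy(data, 2, beta))
```

As `beta` grows, the free energy approaches the ground-state energy, up to a term of order `log(degeneracy) / beta` that comes from relabelling the clusters. Enumeration costs `k ** len(data)` evaluations, so it is only feasible for a handful of points, which is why an asymptotic theory is needed for large samples.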