Replica analysis of Bayesian data clustering

Cited by: 0
Authors
Mozeika, Alexander [1 ]
Coolen, Anthony C. C. [2 ,3 ]
Affiliations
[1] Kings Coll London, Inst Math & Mol Biomed, Hodgkin Bldg, London SE1 1UL, England
[2] Kings Coll London, Dept Math, London WC2R 2LS, England
[3] London Inst Math Sci, 35A South St, London W1K 2XF, England
Funding
UK Medical Research Council;
Keywords
clustering; Bayesian inference; replica; STATISTICAL-MECHANICS; CLASSIFICATION;
DOI
10.1088/1751-8121/ab59af
Chinese Library Classification number
O4 [Physics];
Discipline classification code
0702 ;
Abstract
We use statistical mechanics to study model-based Bayesian data clustering. In this approach, each partition of the data into clusters is regarded as a microscopic system state, the negative data log-likelihood gives the energy of each state, and the data set realisation acts as disorder. Optimal clustering corresponds to the ground state of the system, and is hence obtained from the free energy via a low 'temperature' limit. We assume that for large sample sizes the free energy density is self-averaging, and we use the replica method to compute the asymptotic free energy density. The main order parameter in the resulting (replica symmetric) theory, the distribution of the data over the clusters, satisfies a self-consistent equation which can be solved by a population dynamics algorithm. From this order parameter one computes the average free energy, and all relevant macroscopic characteristics of the problem. The theory describes numerical experiments perfectly, and gives a significant improvement over the mean-field theory that was used to study this model in the past.
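The mapping described in the abstract (partition = state, negative log-likelihood = energy, optimal clustering = ground state reached in the low-temperature limit) can be illustrated on a toy problem small enough to enumerate every partition. The following is a minimal sketch and not the authors' model or method: it assumes unit-variance Gaussian clusters whose means are set to the cluster sample means, and the names `energy`, `free_energy`, `ground_state` and `toy_data` are hypothetical.

```python
import itertools
import math


def energy(data, labels, n_clusters):
    """Negative log-likelihood of one partition.

    Assumption (illustrative only): each cluster is a unit-variance Gaussian
    whose mean equals the sample mean of its members.
    """
    total = 0.0
    for k in range(n_clusters):
        members = [x for x, c in zip(data, labels) if c == k]
        if not members:
            continue  # an empty cluster contributes nothing
        mean = sum(members) / len(members)
        total += sum(
            0.5 * (x - mean) ** 2 + 0.5 * math.log(2.0 * math.pi)
            for x in members
        )
    return total


def all_partitions(n_points, n_clusters):
    """Every assignment of n_points to n_clusters labels (brute force)."""
    return itertools.product(range(n_clusters), repeat=n_points)


def free_energy(data, n_clusters, beta):
    """F(beta) = -(1/beta) * log sum_states exp(-beta * E(state))."""
    energies = [
        energy(data, labels, n_clusters)
        for labels in all_partitions(len(data), n_clusters)
    ]
    e_min = min(energies)
    # Subtract e_min before exponentiating to avoid underflow at large beta.
    log_z = -beta * e_min + math.log(
        sum(math.exp(-beta * (e - e_min)) for e in energies)
    )
    return -log_z / beta


def ground_state(data, n_clusters):
    """Partition with the lowest energy, i.e. the optimal clustering here."""
    return min(
        all_partitions(len(data), n_clusters),
        key=lambda labels: energy(data, labels, n_clusters),
    )


# Hypothetical toy data: two well-separated groups of three points.
toy_data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
best = ground_state(toy_data, 2)
print("ground state:", best, "energy:", energy(toy_data, best, 2))
for beta in (0.5, 5.0, 50.0):
    print("beta =", beta, "free energy =", free_energy(toy_data, 2, beta))
```

As `beta` grows, the printed free energy approaches the ground-state energy from below, which is the low 'temperature' limit mentioned in the abstract. Brute-force enumeration is only feasible for a handful of points; the replica calculation in the paper addresses the large-sample regime instead.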
Pages: 32
Related papers
50 in total
  • [31] CONVENTIONAL AND BAYESIAN VALIDATION FOR FUZZY CLUSTERING ANALYSIS
    Limam, Olfa
    Ben Abdelaziz, Fouad
    ICFC 2010/ ICNC 2010: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON FUZZY COMPUTATION AND INTERNATIONAL CONFERENCE ON NEURAL COMPUTATION, 2010, : 135 - 140
  • [32] Clustering and Data Analysis
    丁立人
    留学, 2018, (19) : 80 - 81
  • [33] Bayesian clustering of distributions in stochastic frontier analysis
    J. E. Griffin
    Journal of Productivity Analysis, 2011, 36 : 275 - 283
  • [34] Contrastive analysis of English literature comparative literature based on Bayesian clustering approach to big data
    Jiang Li
    Cluster Computing, 2019, 22 : 7031 - 7037
  • [35] Bayesian analysis of data from segmented super-resolution images for quantifying protein clustering
    Kosuta, Tina
    Cullell-Dalmau, Marta
    Cella Zanacchi, Francesca
    Manzo, Carlo
    PHYSICAL CHEMISTRY CHEMICAL PHYSICS, 2020, 22 (03) : 1107 - 1114
  • [36] Contrastive analysis of English literature comparative literature based on Bayesian clustering approach to big data
    Li, Jiang
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2019, 22 (Suppl 3) : S7031 - S7037
  • [37] Compressed Sensing Performance Analysis via Replica Method using Bayesian framework
    Tesfamicael, Solomon A.
    Godana, Bruhtesfa E.
    2015 17TH UKSIM-AMSS INTERNATIONAL CONFERENCE ON COMPUTER MODELLING AND SIMULATION (UKSIM), 2015, : 281 - 289
  • [38] Coarse master equation from Bayesian analysis of replica molecular dynamics simulations
    Sriraman, S
    Kevrekidis, LG
    Hummer, G
    JOURNAL OF PHYSICAL CHEMISTRY B, 2005, 109 (14) : 6479 - 6484
  • [39] R/BHC: fast Bayesian hierarchical clustering for microarray data
    Richard S Savage
    Katherine Heller
    Yang Xu
    Zoubin Ghahramani
    William M Truman
    Murray Grant
    Katherine J Denby
    David L Wild
    BMC Bioinformatics, 10
  • [40] Spike sorting: Bayesian clustering of non-stationary data
    Bar-Hillel, Aharon
    Spiro, Adam
    Stark, Eran
    JOURNAL OF NEUROSCIENCE METHODS, 2006, 157 (02) : 303 - 316