Entropy regularization in probabilistic clustering

Authors
Franzolini, Beatrice [1]
Rebaudo, Giovanni [2, 3]
Affiliations
[1] Bocconi Univ, Dept Decis Sci, Milan, Italy
[2] Univ Turin, Turin, Italy
[3] Collegio Carlo Alberto, Turin, Italy
Source
STATISTICAL METHODS AND APPLICATIONS | 2024 / Vol. 33 / Issue 01
Keywords
Dirichlet process; Loss functions; Mixture models; Unbalanced clusters; Random partition; DIRICHLET PROCESS; PARTITION DISTRIBUTION; OUTLIER DETECTION; MIXTURE-MODELS; INFERENCE; NUMBER;
DOI
10.1007/s10260-023-00716-y
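Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification code
020208; 070103; 0714;
Abstract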
Bayesian nonparametric mixture models are widely used to cluster observations. However, one major drawback of the approach is that the estimated partition often presents unbalanced cluster frequencies, with only a few dominating clusters and a large number of sparsely populated ones. This feature translates into results that are often uninterpretable unless we are willing to ignore a substantial number of observations and clusters. Interpreting the posterior distribution as a penalized likelihood, we show how the unbalance can be explained as a direct consequence of the cost functions involved in estimating the partition. In light of our findings, we propose a novel Bayesian estimator of the clustering configuration. The proposed estimator is equivalent to a post-processing procedure that reduces the number of sparsely populated clusters and enhances interpretability. The procedure takes the form of entropy regularization of the Bayesian estimate. While being computationally convenient with respect to alternative strategies, it is also theoretically justified as a correction to the Bayesian loss function used for point estimation and, as such, can be applied to any posterior distribution of clusters, regardless of the specific model used.
Pages: 37-60
Number of pages: 24
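Illustration
The post-processing idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' estimator: it assumes the posterior is available as MCMC draws of cluster labels, uses the Binder loss as the base loss, restricts the search to the sampled partitions, and adds a penalty `lam` times the Shannon entropy of the cluster frequencies. The function names, the choice of Binder loss, and the sign and form of the penalty are all assumptions made for this example.

```python
import numpy as np


def posterior_similarity(samples):
    """Fraction of posterior draws in which each pair of items shares a cluster."""
    samples = np.asarray(samples)
    same = samples[:, :, None] == samples[:, None, :]
    return same.mean(axis=0)


def partition_entropy(labels):
    """Shannon entropy (in nats) of the cluster relative frequencies."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())


def expected_binder_loss(labels, psm):
    """Posterior expected Binder loss of a candidate partition, summed over pairs i < j."""
    labels = np.asarray(labels)
    same = (labels[:, None] == labels[None, :]).astype(float)
    upper = np.triu_indices(len(labels), k=1)
    return float(np.abs(same - psm)[upper].sum())


def entropy_regularized_estimate(samples, lam):
    """Return the sampled partition minimizing expected loss + lam * entropy.

    Assumption of this sketch: candidates are the sampled partitions only,
    and lam >= 0 penalizes high-entropy partitions.
    """
    samples = np.asarray(samples)
    psm = posterior_similarity(samples)
    scores = [
        expected_binder_loss(c, psm) + lam * partition_entropy(c)
        for c in samples
    ]
    return samples[int(np.argmin(scores))]
```

With `lam = 0` the function reduces to the usual Binder-loss point estimate over the sampled partitions; increasing `lam` shifts the choice toward partitions with lower entropy, which in this toy setup means merging small clusters into larger ones.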