Entropy regularization in probabilistic clustering

Times cited: 0
Authors
Franzolini, Beatrice [1]
Rebaudo, Giovanni [2, 3]
Affiliations
[1] Bocconi Univ, Dept Decis Sci, Milan, Italy
[2] Univ Turin, Turin, Italy
[3] Collegio Carlo Alberto, Turin, Italy
Source
STATISTICAL METHODS AND APPLICATIONS | 2024 / Vol. 33 / No. 01
Keywords
Dirichlet process; Loss functions; Mixture models; Unbalanced clusters; Random partition; DIRICHLET PROCESS; PARTITION DISTRIBUTION; OUTLIER DETECTION; MIXTURE-MODELS; INFERENCE; NUMBER;
DOI
10.1007/s10260-023-00716-y
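Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract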
Bayesian nonparametric mixture models are widely used to cluster observations. However, one major drawback of the approach is that the estimated partition often presents unbalanced cluster frequencies, with only a few dominating clusters and a large number of sparsely populated ones. This feature translates into results that are often uninterpretable unless we agree to ignore a substantial number of observations and clusters. Interpreting the posterior distribution as a penalized likelihood, we show how the imbalance can be explained as a direct consequence of the cost functions involved in estimating the partition. In light of our findings, we propose a novel Bayesian estimator of the clustering configuration. The proposed estimator is equivalent to a post-processing procedure that reduces the number of sparsely populated clusters and enhances interpretability. The procedure takes the form of entropy regularization of the Bayesian estimate. While being computationally convenient with respect to alternative strategies, it is also theoretically justified as a correction to the Bayesian loss function used for point estimation and, as such, can be applied to any posterior distribution of clusters, regardless of the specific model used.
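The post-processing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator: the abstract does not give the loss or the exact penalty, so the sketch assumes a Binder-type pairwise loss, a penalty equal to a hypothetical weight `lam` times the Shannon entropy of the cluster frequencies, and a search restricted to the sampled partitions. All function names are illustrative.

```python
import numpy as np

def coclustering_matrix(samples):
    """Estimate posterior co-clustering probabilities from sampled label vectors."""
    samples = np.asarray(samples)
    # same[s, i, j] is True when items i and j share a cluster in draw s
    same = samples[:, :, None] == samples[:, None, :]
    return same.mean(axis=0)

def expected_binder_loss(labels, psm):
    """Expected Binder-type loss of a candidate partition (assumed loss, see lead-in)."""
    labels = np.asarray(labels)
    together = labels[:, None] == labels[None, :]
    # Sum over unordered pairs i < j of |1{i, j together} - P(i, j together | data)|
    iu = np.triu_indices(len(labels), k=1)
    return float(np.abs(together[iu].astype(float) - psm[iu]).sum())

def partition_entropy(labels):
    """Shannon entropy of the cluster frequencies of a partition."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

def entropy_regularized_estimate(samples, lam):
    """Pick the sampled partition minimizing expected loss + lam * entropy.

    lam is a hypothetical tuning weight; lam = 0 gives the unpenalized estimate.
    """
    psm = coclustering_matrix(samples)
    scores = [expected_binder_loss(s, psm) + lam * partition_entropy(s)
              for s in samples]
    return np.asarray(samples[int(np.argmin(scores))])

# Toy posterior draws for four items: two draws isolate item 3, one does not.
draws = [[0, 0, 0, 1], [0, 0, 0, 1], [0, 0, 0, 0]]
unpenalized = entropy_regularized_estimate(draws, lam=0.0)
penalized = entropy_regularized_estimate(draws, lam=5.0)
```

In this toy example the unpenalized estimate keeps the singleton cluster, while a large enough `lam` merges it into the dominant cluster, which is the qualitative effect the abstract describes. Because the sketch only needs sampled partitions, it does not depend on the model that produced them.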
Pages: 37-60
Page count: 24
Related papers
50 in total
  • [11] Semi-supervised fuzzy clustering with metric learning and entropy regularization
    Yin, Xuesong
    Shu, Ting
    Huang, Qi
    KNOWLEDGE-BASED SYSTEMS, 2012, 35 : 304 - 311
  • [12] CHARACTERIZATION OF FUZZY CLUSTERING ALGORITHMS IN TERMS OF ENTROPY OF PROBABILISTIC SETS
    HIROTA, K
    PEDRYCZ, W
    PATTERN RECOGNITION LETTERS, 1984, 2 (04) : 213 - 216
  • [13] Regularization, maximum entropy and probabilistic methods in mass spectrometry data processing problems
    Mohammad-Djafari, A
    Giovannelli, JF
    Demoment, G
    Idier, J
    INTERNATIONAL JOURNAL OF MASS SPECTROMETRY, 2002, 215 (1-3) : 175 - 193
  • [14] Classification with Incomplete Probabilistic Labeling Based on Manifold Regularization and Fuzzy Clustering Ensemble
    Berikov, V. B.
    Vikent'ev, A. A.
    PATTERN RECOGNITION AND IMAGE ANALYSIS, 2022, 32 (03) : 515 - 518
  • [15] Classification with Incomplete Probabilistic Labeling Based on Manifold Regularization and Fuzzy Clustering Ensemble
    V. B. Berikov
    A. A. Vikent’ev
    Pattern Recognition and Image Analysis, 2022, 32 : 515 - 518
  • [16] Binary Clustering of Color Images by Fuzzy Co-Clustering with Non-Extensive Entropy Regularization
    Susan, Seba
    Agarwal, Meetu
    Agarwal, Seetu
    Kartikeya, Anand
    Meena, Ritu
    PROCEEDINGS OF THE 2016 2ND INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING AND INFORMATICS (IC3I), 2016, : 512 - 517
  • [17] Entropy-minimization clustering technique for probabilistic packet marking scheme
    Tan, WP
    Lee, BS
    Lee, HCJ
    2004 12TH IEEE INTERNATIONAL CONFERENCE ON NETWORKS, VOLS 1 AND 2 , PROCEEDINGS: UNITY IN DIVERSITY, 2004, : 292 - 298
  • [18] PROBABILISTIC ANALYSIS OF REGULARIZATION
    KEREN, D
    WERMAN, M
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1993, 15 (10) : 982 - 995
  • [19] Fuzzy clustering Algorithm based on Adaptive City-block distance and Entropy Regularization
    Rodriguez, Sara I. R.
    de Carvalho, Francisco de A. T.
    2018 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2018,
  • [20] Regularization of currents and entropy
    Dinh, TC
    Sibony, N
    ANNALES SCIENTIFIQUES DE L ECOLE NORMALE SUPERIEURE, 2004, 37 (06) : 959 - 971