A unified framework for model-based clustering

Cited by: 92
Authors
Zhong, S [1]
Ghosh, J [2]
Affiliations
[1] Florida Atlantic Univ, Dept Comp Sci & Engn, Boca Raton, FL 33431 USA
[2] Univ Texas, Dept Elect & Comp Engn, Austin, TX 78712 USA
Keywords
model-based clustering; similarity-based clustering; partitional clustering; hierarchical agglomerative clustering; deterministic annealing
DOI
10.1162/1532443041827943
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Model-based clustering techniques have been widely used and have shown promising results in many applications involving complex data. This paper presents a unified framework for probabilistic model-based clustering based on a bipartite graph view of data and models that highlights the commonalities and differences among existing model-based clustering algorithms. In this view, clusters are represented as probabilistic models in a model space that is conceptually separate from the data space. For partitional clustering, the view is conceptually similar to the Expectation-Maximization (EM) algorithm. For hierarchical clustering, the graph-based view helps to visualize critical distinctions between similarity-based approaches and model-based approaches. The framework also suggests several useful variations of existing clustering algorithms. Two new variations, balanced model-based clustering and hybrid model-based clustering, are discussed and empirically evaluated on a variety of data types.
Pages: 1001-1037
Page count: 37