Gaussian Clusters and Noise: An Approach Based on the Minimum Description Length Principle

Cited by: 0
Authors
Luosto, Panu [1]
Kivinen, Jyrki [1]
Mannila, Heikki [2]
Affiliations
[1] Univ Helsinki, Dept Comp Sci, FIN-00014 Helsinki, Finland
[2] Aalto Univ, Dept Informat & Comp Sci, Helsinki, Finland
Source
DISCOVERY SCIENCE, DS 2010 | 2010 / Vol. 6332
Keywords
STOCHASTIC COMPLEXITY; INFORMATION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We introduce a well-grounded minimum description length (MDL) based quality measure for a clustering consisting of either spherical or axis-aligned normally distributed clusters and a cluster with a uniform distribution in an axis-aligned rectangular box. The uniform component extends the practical usability of the model, e.g., in the presence of noise, and using the MDL principle for the model selection makes it possible to compare the quality of clusterings with different numbers of clusters. We also introduce a novel search heuristic for finding the best clustering with an unknown number of clusters. The heuristic is based on the idea of moving points from the Gaussian clusters to the uniform one and using MDL for determining the optimal amount of noise. Tests with synthetic data having a clear cluster structure imply that the search method is effective in finding the intuitively correct clustering.
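The idea of letting a code length decide whether a point belongs to a Gaussian cluster or to the uniform noise component can be illustrated with a minimal sketch. This is not the quality measure from the paper: the function `code_length`, the BIC-style parameter penalty, the variance floor, and the fixed box are all assumptions made for illustration, and the values are log-density based, so they are code lengths only up to a discretization constant (and may be negative).

```python
import math

def code_length(points, labels, box_lo, box_hi):
    """Crude two-part code length (in nats) of a clustering.

    labels[i] >= 0 puts point i in a spherical Gaussian cluster;
    labels[i] == -1 puts it in the uniform noise component on the box.
    Hypothetical stand-in, not the measure defined in the paper.
    """
    n, d = len(points), len(points[0])
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)

    total = 0.0
    for lab, members in groups.items():
        m = len(members)
        # Cost of the component labels under the empirical proportions.
        total += -m * math.log(m / n)
        if lab == -1:
            # Uniform density 1/volume inside the axis-aligned box.
            volume = math.prod(hi - lo for lo, hi in zip(box_lo, box_hi))
            total += m * math.log(volume)
        else:
            mean = [sum(p[j] for p in members) / m for j in range(d)]
            sq = sum((p[j] - mean[j]) ** 2 for p in members for j in range(d))
            var = max(sq / (m * d), 1e-6)  # assumed floor, avoids log(0)
            # Negative log-likelihood of a spherical Gaussian.
            total += 0.5 * m * d * math.log(2 * math.pi * var) + sq / (2 * var)
            # Assumed BIC-style penalty: d mean coordinates plus one variance.
            total += 0.5 * (d + 1) * math.log(n)
    return total

# Twenty points on a small circle around the origin plus one distant point.
points = [(0.1 * math.cos(i), 0.1 * math.sin(i)) for i in range(20)] + [(9.0, 9.0)]
box_lo, box_hi = (-10.0, -10.0), (10.0, 10.0)

all_gaussian = code_length(points, [0] * 21, box_lo, box_hi)
with_noise = code_length(points, [0] * 20 + [-1], box_lo, box_hi)
print(all_gaussian, with_noise)
```

In this toy data set, assigning the distant point to the uniform component lets the Gaussian cluster keep a small variance, which shortens the total code length. A search in the spirit of the abstract would repeat such comparisons while moving points to the noise component.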
Pages: 251-265
Page count: 15
Related papers
50 in total
  • [21] The minimum description length principle and model selection in spectropolarimetry
    Ramos, A. Asensio
    ASTROPHYSICAL JOURNAL, 2006, 646 (02) : 1445 - 1451
  • [22] The minimum description length principle for pattern mining: a survey
    Galbrun, Esther
    DATA MINING AND KNOWLEDGE DISCOVERY, 2022, 36 (05) : 1679 - 1727
  • [23] The minimum description length principle for pattern mining: a survey
    Esther Galbrun
    Data Mining and Knowledge Discovery, 2022, 36 : 1679 - 1727
  • [24] Optimal work extraction and the minimum description length principle
    Touzo, Leo
    Marsili, Matteo
    Merhav, Neri
    Roldan, Edgar
    JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT, 2020, 2020 (09):
  • [25] Adaptive ripple down rules method based on minimum description length principle
    Yoshida, T
    Wada, T
    Motoda, H
    Washio, T
    2002 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2002 : 530 - 537
  • [26] Increasing generalizability via the principle of minimum description length Comment
    Bonifay, Wes
    BEHAVIORAL AND BRAIN SCIENCES, 2022, 45
  • [27] Minimum Description Length Principle for Fat-Tailed Distributions
    Nonchev, Bono
    NONLINEAR DYNAMICS OF ELECTRONIC SYSTEMS, 2014, 438 : 68 - 75
  • [28] Minimum Description Length Principle in Supervised Learning With Application to Lasso
    Kawakita, Masanori
    Takeuchi, Jun'ichi
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2020, 66 (07) : 4245 - 4269
  • [29] MINIMUM DESCRIPTION LENGTH PRINCIPLE FOR LINEAR MIXED EFFECTS MODELS
    Li, Li
    Yao, Fang
    Craiu, Radu V.
    Zou, Jialin
    STATISTICA SINICA, 2014, 24 (03) : 1161 - 1178
  • [30] Evaluating the significance of sequence motifs by the minimum description length principle
    Ma, QC
    Wang, JTL
    PROCEEDINGS OF THE FIFTH JOINT CONFERENCE ON INFORMATION SCIENCES, VOLS 1 AND 2, 2000 : A798 - A801