Unidimensional Clustering of Discrete Data Using Latent Tree Models

Cited by: 0
Authors
Liu, April H. [1 ]
Poon, Leonard K. M. [2 ]
Zhang, Nevin L. [1 ]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[2] Hong Kong Inst Educ, Dept Math & Informat Technol, Hong Kong, Peoples R China
Source
PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE | 2015
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper is concerned with model-based clustering of discrete data. Latent class models (LCMs) are usually used for the task. An LCM consists of a latent variable and a number of attributes. It makes the overly restrictive assumption that the attributes are mutually independent given the latent variable. We propose a novel method to relax the assumption. The key idea is to partition the attributes into groups such that correlations among the attributes in each group can be properly modeled by using one single latent variable. The latent variables for the attribute groups are then used to build a number of models and one of them is chosen to produce the clustering results. Extensive empirical studies have been conducted to compare the new method with LCM and several other methods (K-means, kernel K-means and spectral clustering) that are not model-based. The new method outperforms the alternative methods in most cases and the differences are often large.
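As a rough illustration of the baseline the abstract refers to, the sketch below computes cluster posteriors under a plain latent class model, where attributes are assumed independent given the latent variable. It is not the paper's proposed method; the function name `lcm_posterior` and all parameter values are hypothetical and chosen only for the example.

```python
from math import prod


def lcm_posterior(x, prior, cond):
    """Posterior P(Z = k | x) under a latent class model (LCM).

    x     : observed attribute values (ints), one per attribute
    prior : prior[k] = P(Z = k)
    cond  : cond[k][i][v] = P(X_i = v | Z = k)
    """
    # Local independence assumption:
    # P(x, Z = k) = P(Z = k) * prod_i P(x_i | Z = k)
    joint = [
        prior[k] * prod(cond[k][i][v] for i, v in enumerate(x))
        for k in range(len(prior))
    ]
    total = sum(joint)  # marginal likelihood P(x)
    return [j / total for j in joint]


# Made-up parameters: 2 latent classes, 3 binary attributes.
prior = [0.5, 0.5]
cond = [
    [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]],  # class 0 favours value 0
    [[0.2, 0.8], [0.3, 0.7], [0.1, 0.9]],  # class 1 favours value 1
]

post = lcm_posterior([1, 1, 1], prior, cond)
# Hard clustering: assign the record to the most probable class.
cluster = max(range(len(post)), key=post.__getitem__)
```

In practice the parameters would be estimated from data (for example with EM) rather than fixed by hand; the product over attributes is exactly where the independence assumption criticized in the abstract enters.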
Pages: 2771-2777
Page count: 7
Related papers
50 in total
  • [21] Latent variable models for the topographic organisation of discrete and strictly positive data
    Girolami, M
    NEUROCOMPUTING, 2002, 48 : 185 - 198
  • [22] Fast Decoding in Sequence Models Using Discrete Latent Variables
    Kaiser, Lukasz
    Roy, Aurko
    Vaswani, Ashish
    Parmar, Niki
    Bengio, Samy
    Uszkoreit, Jakob
    Shazeer, Noam
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [23] An Inequality for Correlations in Unidimensional Monotone Latent Variable Models for Binary Variables
    Ellis, Jules L.
    PSYCHOMETRIKA, 2014, 79 (02) : 303 - 316
  • [24] An Inequality for Correlations in Unidimensional Monotone Latent Variable Models for Binary Variables
    Jules L. Ellis
    Psychometrika, 2014, 79 : 303 - 316
  • [25] Bayesian mixtures of Hidden Tree Markov Models for structured data clustering
    Bacciu, Davide
    Castellana, Daniele
    NEUROCOMPUTING, 2019, 342 : 49 - 59
  • [26] Clustering discrete-valued data using biological and molecular data
    Wong, AKC
    Chiu, DKY
    Huang, WH
    PROCEEDINGS OF THE FIFTH JOINT CONFERENCE ON INFORMATION SCIENCES, VOLS 1 AND 2, 2000, : A794 - A797
  • [27] Directed Clustering of Multivariate Data Based on Linear or Quadratic Latent Variable Models
    Zhang, Yingjuan
    Einbeck, Jochen
    ALGORITHMS, 2024, 17 (08)
  • [28] Estimation of parameters in latent class models using fuzzy clustering algorithms
    Yang, MS
    Yu, NY
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2005, 160 (02) : 515 - 531
  • [29] Etiologic inertia: Using latent variable models to address risk clustering
    Glass, T. A.
    AMERICAN JOURNAL OF EPIDEMIOLOGY, 2008, 167 (11) : S43 - S43
  • [30] Learning Latent Tree Graphical Models
    Choi, Myung Jin
    Tan, Vincent Y. F.
    Anandkumar, Animashree
    Willsky, Alan S.
    JOURNAL OF MACHINE LEARNING RESEARCH, 2011, 12 : 1771 - 1812