Unsupervised learning of Dirichlet process mixture models with missing data

Authors
Xunan ZHANG [1]
Shiji SONG [1]
Lei ZHU [2]
Keyou YOU [1]
Cheng WU [1]
Affiliations
[1] Department of Automation, Tsinghua University
[2] China Ocean Mineral Resources R&D Association
Funding
National Natural Science Foundation of China
Keywords
Dirichlet processes; missing data; clustering; variational Bayesian; image analysis
DOI
Not available
Chinese Library Classification number
TP391.41
Discipline classification code
080203
Abstract
This study presents a novel approach to unsupervised learning for clustering with missing data. We first extend a finite mixture model to the infinite case by considering Dirichlet process mixtures, which can automatically determine the number of mixture components, or clusters. We then treat the missing features as latent variables and compute their posterior distributions using the variational Bayesian expectation maximization algorithm, which optimizes the evidence lower bound on the complete-data log marginal likelihood. We evaluate the method on several artificial data sets with missing values. The experimental results indicate that the proposed method outperforms some classic imputation methods. Finally, we present an application to the analysis of color images of seabed hydrothermal sulfides.
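The abstract does not spell out how the Dirichlet process prior is represented. As a rough illustration only, the sketch below uses the standard truncated stick-breaking construction, which is a common choice in variational treatments but is an assumption here and not confirmed as the authors' formulation. The function name `stick_breaking_weights` and the parameters `alpha` (concentration) and `truncation` (maximum number of components) are hypothetical.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw mixture weights from a truncated stick-breaking construction.

    Illustrative only: alpha and truncation are assumed parameters,
    not values taken from the paper.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for k in range(truncation):
        # v_k ~ Beta(1, alpha); the final break takes the whole remainder
        # so that the truncated weights sum to one.
        v = 1.0 if k == truncation - 1 else rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights


rng = random.Random(0)
weights = stick_breaking_weights(alpha=1.0, truncation=20, rng=rng)
# Count components that carry a non-negligible share of the mass.
effective = sum(w > 0.01 for w in weights)
print(f"sum={sum(weights):.6f}, components with weight > 0.01: {effective}")
```

With a small `alpha`, most of the mass tends to fall on a few components, which gives some intuition for how such a prior can leave the number of clusters to be inferred from the data.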
Pages: 161-174
Page count: 14
Related papers
  • [1] Unsupervised learning of Dirichlet process mixture models with missing data
    Zhang, Xunan
    Song, Shiji
    Zhu, Lei
    You, Keyou
    Wu, Cheng
    SCIENCE CHINA-INFORMATION SCIENCES, 2016, 59 (01) : 1 - 14
  • [2] Unsupervised learning of Dirichlet process mixture models with missing data (面向缺失数据的Dirichlet过程混合模型无监督学习)
    Xunan Zhang
    Shiji Song
    Lei Zhu
    Keyou You
    Cheng Wu
    Science China Information Sciences, 2016, 59 : 1 - 14
  • [3] Dirichlet process mixture models for insurance loss data
    Hong, Liang
    Martin, Ryan
    SCANDINAVIAN ACTUARIAL JOURNAL, 2018, (06) : 545 - 554
  • [4] Dirichlet process mixture models for unsupervised clustering of symptoms in Parkinson's disease
    White, Nicole
    Johnson, Helen
    Silburn, Peter
    Mengersen, Kerrie
    JOURNAL OF APPLIED STATISTICS, 2012, 39 (11) : 2363 - 2377
  • [5] Learning process models with missing data
    Bridewell, Will
    Langley, Pat
    Racunas, Steve
    Borrett, Stuart
    MACHINE LEARNING: ECML 2006, PROCEEDINGS, 2006, 4212 : 557 - 565
  • [6] Unsupervised learning of mixture regression models for longitudinal data
    Xu, Peirong
    Peng, Heng
    Huang, Tao
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2018, 125 : 44 - 56
  • [7] Sampling in Dirichlet Process Mixture Models for Clustering Streaming Data
    Dinari, Or
    Freifeld, Oren
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151 : 818 - 835
  • [8] An optimal data ordering scheme for Dirichlet process mixture models
    Wang, Xue
    Walker, Stephen G.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2017, 112 : 42 - 52
  • [9] CLASSIFICATION OF MULTIVARIATE DATA USING DIRICHLET PROCESS MIXTURE MODELS
    Djuric, Petar M.
    Ferrari, Andre
    2012 CONFERENCE RECORD OF THE FORTY SIXTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS (ASILOMAR), 2012, : 441 - 445
  • [10] Scalable Estimation of Dirichlet Process Mixture Models on Distributed Data
    Wang, Ruohui
    Lin, Dahua
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 4632 - 4639