Unsupervised learning of Dirichlet process mixture models with missing data

Cited by: 1
Authors
Zhang, Xunan [1 ]
Song, Shiji [1 ]
Zhu, Lei [2 ]
You, Keyou [1 ]
Wu, Cheng [1 ]
Affiliations
[1] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
[2] China Ocean Mineral Resources R&D Assoc, Beijing 100860, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Dirichlet processes; missing data; clustering; variational Bayesian; image analysis; INCOMPLETE DATA; CLASSIFICATION; ALGORITHM; SELECTION; PRIORS;
DOI
10.1007/s11432-015-5429-0
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This study presents a novel approach to unsupervised clustering with missing data. We first extend a finite mixture model to the infinite case by using Dirichlet process mixtures, which can automatically determine the number of mixture components, or clusters. We then treat the missing features as latent variables and compute the posterior distributions using the variational Bayesian expectation maximization algorithm, which optimizes the evidence lower bound on the complete-data log marginal likelihood. We evaluate the method on several artificial data sets with missing values. The experimental results indicate that the proposed method outperforms some classic imputation methods. Finally, we present an application to the analysis of seabed hydrothermal sulfide color images.
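To illustrate why a Dirichlet process mixture does not need a fixed number of clusters, the following is a minimal sketch of the truncated stick-breaking construction of the mixture weights, a representation commonly used in variational inference for Dirichlet process mixtures. The abstract does not say which representation the authors use, so this is an assumption; the function name `stick_breaking_weights` and the parameters `alpha` and `truncation` are hypothetical. The sketch covers only the prior on the weights, not the variational updates or the treatment of missing features.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw truncated stick-breaking weights for a Dirichlet process.

    v_k ~ Beta(1, alpha) and pi_k = v_k * prod_{j<k} (1 - v_j).
    The last component takes the remaining mass so the weights sum to 1.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    weights = []
    remaining = 1.0  # length of the stick that has not been broken off yet
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


rng = random.Random(0)  # fixed seed so the draw is repeatable
weights = stick_breaking_weights(alpha=1.0, truncation=20, rng=rng)
effective = sum(1 for w in weights if w > 0.01)
print(f"sum of weights: {sum(weights):.6f}")
print(f"components with weight > 0.01: {effective}")
```

With a small `alpha`, most of the mass typically falls on the first few components, so the remaining components are effectively unused; a larger `alpha` spreads the mass over more components.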
Pages: 1-14
Number of pages: 14
Related papers
50 in total
  • [1] Unsupervised learning of Dirichlet process mixture models with missing data
    Xunan ZHANG
    Shiji SONG
    Lei ZHU
    Keyou YOU
    Cheng WU
    [J]. Science China(Information Sciences), 2016, 59 (01) : 161 - 174
  • [2] Unsupervised learning of Dirichlet process mixture models with missing data
    Xunan Zhang
    Shiji Song
    Lei Zhu
    Keyou You
    Cheng Wu
    [J]. Science China Information Sciences, 2016, 59 : 1 - 14
  • [3] Dirichlet process mixture models for insurance loss data
    Hong, Liang
    Martin, Ryan
    [J]. SCANDINAVIAN ACTUARIAL JOURNAL, 2018, (06) : 545 - 554
  • [4] Learning process models with missing data
    Bridewell, Will
    Langley, Pat
    Racunas, Steve
    Borrett, Stuart
    [J]. MACHINE LEARNING: ECML 2006, PROCEEDINGS, 2006, 4212 : 557 - 565
  • [5] Dirichlet process mixture models for unsupervised clustering of symptoms in Parkinson's disease
    White, Nicole
    Johnson, Helen
    Silburn, Peter
    Mengersen, Kerrie
    [J]. JOURNAL OF APPLIED STATISTICS, 2012, 39 (11) : 2363 - 2377
  • [6] Unsupervised learning of mixture regression models for longitudinal data
    Xu, Peirong
    Peng, Heng
    Huang, Tao
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2018, 125 : 44 - 56
  • [7] Sampling in Dirichlet Process Mixture Models for Clustering Streaming Data
    Dinari, Or
    Freifeld, Oren
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151 : 818 - 835
  • [8] An optimal data ordering scheme for Dirichlet process mixture models
    Wang, Xue
    Walker, Stephen G.
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2017, 112 : 42 - 52
  • [9] CLASSIFICATION OF MULTIVARIATE DATA USING DIRICHLET PROCESS MIXTURE MODELS
    Djuric, Petar M.
    Ferrari, Andre
    [J]. 2012 CONFERENCE RECORD OF THE FORTY SIXTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS (ASILOMAR), 2012, : 441 - 445
  • [10] Scalable Estimation of Dirichlet Process Mixture Models on Distributed Data
    Wang, Ruohui
    Lin, Dahua
    [J]. PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 4632 - 4639