Data Clustering using Online Variational Learning of Finite Scaled Dirichlet Mixture Models

Cited by: 3
Authors
Nguyen, Hieu [1]
Kalra, Meeta [1]
Azam, Muhammad [2]
Bouguila, Nizar [1]
Affiliations
[1] Concordia Univ, Concordia Inst Informat Syst Engn, Montreal, PQ H3G 1M8, Canada
[2] Concordia Univ, Dept Elect & Comp Engn, Montreal, PQ H3G 1M8, Canada
Source
2019 IEEE 20TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION FOR DATA SCIENCE (IRI 2019) | 2019
Keywords
Online learning; Variational inference; Finite mixture model; Scaled Dirichlet distribution; Unsupervised learning; Spam detection; Image clustering
DOI
10.1109/IRI.2019.00050
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With massive amounts of data created on a daily basis, the demand for data analysis is ubiquitous. Recent developments in technology have made machine learning techniques applicable to a wide range of problems. In this paper, we focus on cluster analysis, an important aspect of data analysis: the ability to automatically discover groups of similar data is crucial for subsequent information retrieval and anomaly detection tasks. We therefore propose an online variational inference framework for finite Scaled Dirichlet mixture models. By handling large-scale data efficiently, the online approach improves the scalability of finite mixture models for demanding real-time applications. The proposed method can simultaneously update the model's parameters and determine the optimal number of components without the complex computation of conventional Bayesian algorithms. The effectiveness of our model is demonstrated on challenging problems, including spam detection and image clustering.
Pages: 267 - 274
Number of pages: 8
Related papers
50 in total
  • [21] Human Action Recognition using Accelerated Variational Learning of Infinite Dirichlet Mixture Models
    Fan, Wentao
    Sallay, Hassen
    Bouguila, Nizar
    Du, Ji-Xiang
    2015 IEEE 14TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2015, : 451 - 456
  • [22] Simultaneous Bayesian clustering and feature selection using RJMCMC-based learning of finite generalized Dirichlet mixture models
    Elguebaly, Tarek
    Bouguila, Nizar
    SIGNAL PROCESSING, 2013, 93 (06) : 1531 - 1546
  • [23] Unsupervised Variational Learning of Finite Generalized Inverted Dirichlet Mixture Models with Feature Selection and Component Splitting
    Maanicshah, Kamal
    Ali, Samr
    Fan, Wentao
    Bouguila, Nizar
    IMAGE ANALYSIS AND RECOGNITION (ICIAR 2019), PT II, 2019, 11663 : 94 - 105
  • [24] Sampling in Dirichlet Process Mixture Models for Clustering Streaming Data
    Dinari, Or
    Freifeld, Oren
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151 : 818 - 835
  • [25] Dirichlet Process Mixture Models with Pairwise Constraints for Data Clustering
    Li C.
    Rana S.
    Phung D.
    Venkatesh S.
    Annals of Data Science, 2016, 3 (2) : 205 - 223
  • [26] Clustering compositional data using Dirichlet mixture model
    Pal, Samyajoy
    Heumann, Christian
    PLOS ONE, 2022, 17 (05):
  • [27] Variational learning of hierarchical infinite generalized Dirichlet mixture models and applications
    Wentao Fan
    Hassen Sallay
    Nizar Bouguila
    Sami Bourouis
    Soft Computing, 2016, 20 : 979 - 990
  • [28] Variational learning of hierarchical infinite generalized Dirichlet mixture models and applications
    Fan, Wentao
    Sallay, Hassen
    Bouguila, Nizar
    Bourouis, Sami
    SOFT COMPUTING, 2016, 20 (03) : 979 - 990
  • [29] Proportional data modeling via selection and estimation of a finite mixture of scaled Dirichlet distributions
    Zamzami, Nuha
    Alsuroji, Rua
    Eromonsele, Oboh
    Bouguila, Nizar
    COMPUTATIONAL INTELLIGENCE, 2020, 36 (02) : 459 - 485
  • [30] DIRICHLET PROCESS MIXTURE MODELS FOR CLUSTERING I-VECTOR DATA
    Seshadri, Shreyas
    Remes, Ulpu
    Rasanen, Okko
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5470 - 5474