Unsupervised nested Dirichlet finite mixture model for clustering

Cited by: 2

Authors:

Alkhawaja, Fares [1]

Bouguila, Nizar [1]

Affiliations:

[1] Concordia Univ, Concordia Inst Informat Syst Engn, Montreal, PQ, Canada

Source:

APPLIED INTELLIGENCE | 2023 / Vol. 53 / Issue 21

Keywords:

Nested Dirichlet distribution; Dirichlet-tree distribution; Minimum message length; Finite mixtures; Hierarchical learning; GENERALIZED DIRICHLET; INFORMATION; FRAMEWORK

DOI:

10.1007/s10489-023-04888-8

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The Dirichlet distribution is widely used in mixture models. Despite its flexibility, it has limitations, such as a restrictive covariance matrix and a direct proportionality between its mean and variance. This work introduces a generalization of the Dirichlet distribution, the Nested Dirichlet distribution, in the context of finite mixture models; its hierarchical structure provides more flexibility and overcomes these drawbacks. Model learning is based on the generalized expectation-maximization algorithm, with parameters initialized by the method of moments and estimated through the iterative Newton-Raphson method. The minimum message length criterion is proposed to determine the number of mixture components that best describes the data clusters. The Nested Dirichlet distribution is proven to belong to the exponential family, which offers several advantages, such as closed-form calculation of several probabilistic distances. The Nested Dirichlet mixture model is compared with the Dirichlet mixture model, the generalized Dirichlet mixture model, and a Convolutional Neural Network as a deep learning network; this comparison, carried out on challenging datasets, validates the proposed framework. The hierarchical feature of the model is applied to challenging real-world tasks such as hierarchical cluster analysis and hierarchical feature learning, showing a significant improvement in accuracy.
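
The covariance limitation and the method-of-moments initialization mentioned in the abstract can be illustrated for the plain Dirichlet distribution. The following is a minimal sketch, not the authors' code: it covers only the standard Dirichlet (not the Nested Dirichlet), and the function names `dirichlet_moments` and `moment_init` are hypothetical names chosen for illustration. It uses the standard Dirichlet moment formulas, under which every off-diagonal covariance is negative.

```python
# Illustrative sketch for the standard Dirichlet only (an assumption;
# the paper's Nested Dirichlet model is not reproduced here).

def dirichlet_moments(alpha):
    """Return (mean, variance, covariance matrix) of Dirichlet(alpha)."""
    a0 = sum(alpha)                      # concentration alpha_0
    mean = [a / a0 for a in alpha]       # E[X_i] = alpha_i / alpha_0
    d = len(alpha)
    cov = [
        [
            # Var[X_i] = m_i (1 - m_i) / (alpha_0 + 1)
            # Cov[X_i, X_j] = -m_i m_j / (alpha_0 + 1), always negative
            (mean[i] * (1.0 - mean[i]) if i == j else -mean[i] * mean[j])
            / (a0 + 1.0)
            for j in range(d)
        ]
        for i in range(d)
    ]
    var = [cov[i][i] for i in range(d)]
    return mean, var, cov


def moment_init(mean, var):
    """Method-of-moments estimate of alpha.

    Uses the means of all dimensions and the variance of the first one:
    alpha_0 = m_1 (1 - m_1) / v_1 - 1, then alpha_i = m_i * alpha_0.
    """
    a0 = mean[0] * (1.0 - mean[0]) / var[0] - 1.0
    return [m * a0 for m in mean]


alpha = [2.0, 3.0, 5.0]
mean, var, cov = dirichlet_moments(alpha)
recovered = moment_init(mean, var)   # round trip back to alpha
```

In practice the means and variances would be computed from data assigned to each cluster, and the resulting estimate would serve only as a starting point for an iterative refinement such as Newton-Raphson.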

Pages: 25232 - 25258

Page count: 27

50 entries in total

[1] Unsupervised nested Dirichlet finite mixture model for clustering
Alkhawaja, Fares
Bouguila, Nizar
[J]. APPLIED INTELLIGENCE, 2023, 53 (21) : 25232 - 25258
[2] The nested joint clustering via Dirichlet process mixture model
Han, Shengtong
Zhang, Hongmei
Sheng, Wenhui
Arshad, Hasan
[J]. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2019, 89 (05) : 815 - 830
[3] Unsupervised learning of a finite mixture model based on the Dirichlet distribution and its application
Bouguila, N
Ziou, D
Vaillancourt, J
[J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2004, 13 (11) : 1533 - 1543
[4] Unsupervised selection of a finite Dirichlet mixture model: An MML-based approach
Bouguila, Nizar
Ziou, Djemel
[J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2006, 18 (08) : 993 - 1009
[5] Using unsupervised learning of a finite Dirichlet mixture model to improve pattern recognition applications
Bouguila, N
Ziou, D
[J]. PATTERN RECOGNITION LETTERS, 2005, 26 (12) : 1916 - 1925
[6] A powerful finite mixture model based on the generalized Dirichlet distribution: Unsupervised learning and applications
Bouguila, N
Ziou, D
[J]. PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 1, 2004, : 280 - 283
[7] Unsupervised clustering and feature weighting based on Generalized Dirichlet mixture modeling
Ben Ismail, Mohamed Maher
Frigui, Hichem
[J]. INFORMATION SCIENCES, 2014, 274 : 35 - 54
[8] Unsupervised clustering using nonparametric finite mixture models
Hunter, David R.
[J]. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2024, 16 (01)
[9] Research on dirichlet process mixture model for clustering
Zhang, Biyao
Zhang, Kaisong
Zhong, Luo
Zhang, Xuanya
[J]. Ingenierie des Systemes d'Information, 2019, 24 (02) : 183 - 189
[10] Dirichlet process mixture models for unsupervised clustering of symptoms in Parkinson's disease
White, Nicole
Johnson, Helen
Silburn, Peter
Mengersen, Kerrie
[J]. JOURNAL OF APPLIED STATISTICS, 2012, 39 (11) : 2363 - 2377
