A Probabilistic Distance Clustering Algorithm Using Gaussian and Student-t Multivariate Density Distributions

Authors
Tortora C. [1]
McNicholas P.D. [2]
Palumbo F. [3]
Affiliations
[1] Department of Mathematics and Statistics, San José State University, San José, CA
[2] Department of Mathematics and Statistics, McMaster University, Hamilton, ON
[3] Dipartimento di Scienze Politiche, University of Naples Federico II, Naples
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Cluster analysis; Dissimilarity measures; Multivariate distributions; PD-clustering
DOI
10.1007/s42979-020-0067-z
Abstract
A new dissimilarity measure for cluster analysis is presented and used in the context of probabilistic distance (PD) clustering. The basic assumption of PD-clustering is that, for each unit, the product of the probability that the unit belongs to a cluster and the distance between the unit and that cluster is constant. This constant is a measure of the classifiability of the unit, and the sum of the constants over all units is called the joint distance function (JDF). The parameters that minimize the JDF maximize the classifiability of the units. The new dissimilarity measure is based on symmetric density functions and allows the method to find clusters characterized by different variances and correlations among variables. The multivariate Gaussian and multivariate Student-t distributions are used, and the resulting methods outperform classical PD-clustering and its variant, PD-clustering adjusted for cluster size, on simulated and real datasets. © 2020, Springer Nature Singapore Pte Ltd.
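The constant-product assumption in the abstract can be illustrated with a minimal sketch. It assumes plain Euclidean distance and fixed, hand-picked cluster centers, so it corresponds to classical PD-clustering rather than the density-based dissimilarity the paper proposes. The function name `pd_memberships`, the `eps` guard, and the toy data are hypothetical, chosen only for illustration. Requiring that p_ik * d_ik equals the same constant F_i for every cluster k and that the probabilities of each unit sum to one gives p_ik proportional to 1/d_ik and F_i = 1 / sum_k(1/d_ik).

```python
import numpy as np

def pd_memberships(X, centers, eps=1e-12):
    """Membership probabilities, per-unit constants, and JDF for fixed centers.

    Assumption: Euclidean distance (classical PD-clustering), not the
    Gaussian or Student-t based dissimilarity described in the abstract.
    """
    # Pairwise distances between units and centers, shape (n_units, n_clusters).
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.maximum(d, eps)  # guard against a unit sitting exactly on a center
    inv = 1.0 / d
    # p_ik * d_ik = F_i and sum_k p_ik = 1 imply p_ik proportional to 1 / d_ik.
    p = inv / inv.sum(axis=1, keepdims=True)
    # Per-unit constant F_i; smaller values mean the unit is easier to classify.
    F = 1.0 / inv.sum(axis=1)
    # The joint distance function is the sum of the constants over all units.
    return p, F, F.sum()

# Toy data (hypothetical): four units on a line and two fixed centers.
X = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0], [5.0, 0.0]])
centers = np.array([[0.5, 0.0], [4.5, 0.0]])
p, F, jdf = pd_memberships(X, centers)
print(p)
print(F, jdf)
```

A full PD-clustering fit would alternate this membership step with an update of the cluster parameters until the JDF stops decreasing; that update is omitted here because the abstract does not specify it.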
Related papers
50 in total
  • [1] Student-t variational autoencoder for robust multivariate density estimation
    Takahashi, Hiroshi
    Iwata, Tomoharu
    Yamanaka, Yuki
    Yamada, Masanori
    Yagi, Satoshi
    Kashima, Hisashi
    [J]. Transactions of the Japanese Society for Artificial Intelligence, 2021, 36 (03)
  • [2] Multivariate Myriad Filters Based on Parameter Estimation of Student-t Distributions
    Laus, Friederike
    Steidl, Gabriele
    [J]. SIAM JOURNAL ON IMAGING SCIENCES, 2019, 12 (04) : 1864 - 1904
  • [3] Distance and density based clustering algorithm using Gaussian kernel
    Gungor, Emre
    Ozmen, Ahmet
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2017, 69 : 10 - 20
  • [4] Multivariate Gaussian and Student-t process regression for multi-output prediction
    Chen, Zexun
    Wang, Bo
    Gorban, Alexander N.
    [J]. NEURAL COMPUTING & APPLICATIONS, 2020, 32 (08) : 3005 - 3028
  • [5] Multivariate Gaussian and Student-t process regression for multi-output prediction
    Zexun Chen
    Bo Wang
    Alexander N. Gorban
    [J]. Neural Computing and Applications, 2020, 32 : 3005 - 3028
  • [6] On moments of folded and truncated multivariate Student-t distributions based on recurrence relations
    Galarza, Christian E.
    Lin, Tsung-I
    Wang, Wan-Lun
    Lachos, Victor H.
    [J]. METRIKA, 2021, 84 (06) : 825 - 850
  • [7] On moments of folded and truncated multivariate Student-t distributions based on recurrence relations
    Christian E. Galarza
    Tsung-I Lin
    Wan-Lun Wang
    Víctor H. Lachos
    [J]. Metrika, 2021, 84 : 825 - 850
  • [8] Bayesian QTL mapping using skewed Student-t distributions
    von Rohr, P
    Hoeschele, I
    [J]. GENETICS SELECTION EVOLUTION, 2002, 34 (01) : 1 - 21
  • [9] Bayesian QTL mapping using skewed Student-t distributions
    Peter von Rohr
    Ina Hoeschele
    [J]. Genetics Selection Evolution, 34
  • [10] Correction to: Multivariate Gaussian and Student-t process regression for multi-output prediction
    Zexun Chen
    Bo Wang
    Alexander N. Gorban
    [J]. Neural Computing and Applications, 2020, 32 : 11963 - 11963