A Novel Approach for Clustering High-Dimensional Data using Kernel Hubness

Authors
Amina, M. [1]
Farook, Syed K. [1]
Affiliation
[1] MES Coll Engn, Comp Sci & Engn Dept, Kuttippuram, Kerala, India
Keywords
Clustering; High dimensional clustering; Hub based clustering; Kernel
DOI
10.1109/ICACC.2015.67
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering high-dimensional data, which now appears in almost every field, is becoming a very tedious process. The key disadvantage of high-dimensional data is the curse of dimensionality. As the dimensionality of a dataset grows, the data points become sparse and the density of each region drops, which makes the data difficult to cluster and degrades the performance of traditional clustering algorithms. To address these difficulties, hubness-based algorithms were introduced. These algorithms exploit the way data points are distributed among k-nearest-neighbor lists. Hubness is an unsupervised measure that identifies the points which appear in the k-nearest-neighbor lists more frequently than other points in the dataset. Three algorithms are mainly used for hub-based clustering: K-hubs, Hubness Proportional Clustering, and Hubness Proportional K-Means. The K-hubs algorithm is used to initialize the hubs for the clusters. The Hubness Proportional Clustering (HPC) algorithm is used to group the probabilistic data models. The Hubness Proportional K-Means (HPKM) algorithm integrates hubness-based centroid selection with the partitioning process. These algorithms are used to increase the efficiency and the predictive accuracy of the system. Their main drawback is that the number of iterations increases as dimensionality increases. To overcome this drawback, a new algorithm is proposed that combines kernel mapping with the hubness phenomenon. The proposed algorithm detects arbitrarily shaped clusters in the dataset and improves clustering performance by minimizing the intra-cluster distance and maximizing the inter-cluster distance, which improves cluster quality.
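The hubness measure described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it only counts how often each point appears in the k-nearest-neighbor lists of other points, with distances taken in a kernel-induced feature space. The RBF kernel, the default `gamma = 1 / n_features`, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma=None):
    """RBF kernel K[i, j] = exp(-gamma * ||x_i - x_j||^2) (assumed kernel choice)."""
    if gamma is None:
        gamma = 1.0 / X.shape[1]  # assumed default, not taken from the paper
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from rounding
    return np.exp(-gamma * sq_dists)

def kernel_distances(K):
    """Feature-space distance: d(i, j)^2 = K[i, i] + K[j, j] - 2 * K[i, j]."""
    diag = np.diag(K)
    d2 = diag[:, None] + diag[None, :] - 2.0 * K
    return np.sqrt(np.maximum(d2, 0.0))

def hubness_scores(D, k):
    """Count how many k-nearest-neighbor lists each point appears in."""
    n = D.shape[0]
    D = D.copy()
    np.fill_diagonal(D, np.inf)            # a point is not its own neighbor
    knn = np.argsort(D, axis=1)[:, :k]     # indices of the k nearest neighbors
    return np.bincount(knn.ravel(), minlength=n)

# Hypothetical usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
scores = hubness_scores(kernel_distances(rbf_kernel_matrix(X)), k=10)
hub_candidates = np.argsort(scores)[::-1][:5]  # points with the highest counts
```

Points with high scores are the hub candidates that a K-hubs style method could use as cluster prototypes. Because every point contributes exactly k neighbors, the scores always sum to n * k.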
Pages: 94-97
Number of pages: 4