A Novel Approach for Clustering High-Dimensional Data using Kernel Hubness

Times cited: 0
Authors
Amina, M. [1]
Farook, Syed K. [1]
Affiliations
[1] MES Coll Engn, Comp Sci & Engn Dept, Kuttippuram, Kerala, India
Keywords
Clustering; High dimensional clustering; Hub based clustering; Kernel
DOI
10.1109/ICACC.2015.67
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Clustering high-dimensional data, which now arises in almost every field, is becoming a very tedious process. The key disadvantage of high-dimensional data is the curse of dimensionality. As the dimensionality of a dataset grows, the data points become sparse and the density of regions decreases, which makes the data difficult to cluster and reduces the performance of traditional clustering algorithms. To address these difficulties, hubness-based algorithms were introduced. These algorithms exploit the way data points are distributed among k-nearest-neighbor lists. Hubness is an unsupervised measure that identifies which points appear in the k-nearest-neighbor lists of other points more frequently than the rest of the dataset. Three algorithms are mainly used for hub-based clustering: K-hubs, Hubness Proportional Clustering (HPC), and Hubness Proportional K-Means (HPKM). The K-hubs algorithm is used to initialize the hubs for the clusters. The HPC algorithm is used to group probabilistic data models. The HPKM algorithm integrates hubness-based centroid selection with the partitioning process. These algorithms are used to increase the efficiency and predictive accuracy of the system. Their main drawback is that the number of iterations increases as dimensionality increases. To overcome this drawback, a new algorithm is proposed that combines kernel mapping with the hubness phenomenon. The proposed algorithm detects arbitrarily shaped clusters in the dataset and improves cluster quality by minimizing the intra-cluster distance and maximizing the inter-cluster distance.
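The hubness count described in the abstract (how often a point appears in the k-nearest-neighbor lists of other points) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name hubness_scores, the use of Euclidean distance, and the brute-force pairwise computation are assumptions made for illustration only, and the kernel mapping step of the proposed algorithm is not shown.

```python
import numpy as np

def hubness_scores(X, k):
    """Count, for each point, how many other points list it among
    their k nearest neighbors (assumed here: Euclidean distance)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    # Brute-force pairwise squared distances; fine for small n only.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    # A point must not count as its own neighbor.
    np.fill_diagonal(dist, np.inf)
    # Indices of the k nearest neighbors of every point.
    knn = np.argsort(dist, axis=1, kind="stable")[:, :k]
    # Occurrences of each index across all neighbor lists.
    return np.bincount(knn.ravel(), minlength=n)

if __name__ == "__main__":
    points = [[0.0], [1.0], [2.5], [10.0]]
    print(hubness_scores(points, k=1))
```

Points with the highest counts are the hub candidates. Under the assumptions above, a K-hubs style method would pick its cluster representatives from among them.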
Pages: 94-97
Number of pages: 4