A Novel Approach for Clustering High-Dimensional Data using Kernel Hubness

Cited by: 0

Authors:

Amina, M. [1]

Farook, Syed K. [1]

Affiliations:

[1] MES Coll Engn, Comp Sci & Engn Dept, Kuttippuram, Kerala, India

Source:

2015 Fifth International Conference on Advances in Computing and Communications (ICACC) | 2015

Keywords:

Clustering; High dimensional clustering; Hub based clustering; Kernel

DOI:

10.1109/ICACC.2015.67

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Clustering high-dimensional data, which now appears in almost every field, is becoming a very tedious process. The key disadvantage of high-dimensional data is the curse of dimensionality. As the dimensionality of a dataset grows, the data points become sparse and the density of regions decreases, which makes the data difficult to cluster and further reduces the performance of traditional clustering algorithms. To overcome these difficulties, hubness-based algorithms were introduced. These algorithms exploit hubness, which influences the distribution of data points among the k-nearest neighbors. Hubness is an unsupervised measure that identifies which points appear more frequently in k-nearest-neighbor lists than other points in the dataset. Three main algorithms are used for hub-based clustering: K-hubs, Hubness Proportional Clustering, and Hubness Proportional K-Means. The K-hubs algorithm is used to initialize the hubs for the clusters. The Hubness Proportional Clustering (HPC) algorithm is used to group the probabilistic data models. The Hubness Proportional K-Means (HPKM) algorithm integrates hubness-based centroid selection with the partitioning process. These algorithms are mainly used to increase the efficiency and predictive accuracy of the system. Their main drawback is that the number of iterations increases as the dimensionality increases. To overcome this drawback, a new algorithm is proposed that combines kernel mapping with the hubness phenomenon. The proposed algorithm detects arbitrarily shaped clusters in the dataset and improves clustering performance by minimizing the intra-cluster distance and maximizing the inter-cluster distance, which improves cluster quality.
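
The notion of hubness described in the abstract can be illustrated with a minimal sketch that counts how often each point occurs in the k-nearest-neighbor lists of the other points. This is only an illustration under stated assumptions, not the authors' algorithm: the function name `k_occurrences` is hypothetical, plain Euclidean distance is assumed, and the kernel mapping mentioned in the abstract is not included because the abstract does not specify it.

```python
import numpy as np


def k_occurrences(X, k):
    """Count, for each point, how many other points list it among their k nearest neighbors.

    Hypothetical helper for illustration only; assumes Euclidean distance.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_norms = np.sum(X ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    # A point is never counted as its own neighbor
    np.fill_diagonal(d2, np.inf)
    # Indices of the k nearest neighbors of every point (stable sort for reproducible ties)
    knn = np.argsort(d2, axis=1, kind="stable")[:, :k]
    # Occurrence count of each index across all neighbor lists
    return np.bincount(knn.ravel(), minlength=n)


if __name__ == "__main__":
    points = [[0.0], [1.0], [2.5], [10.0]]
    scores = k_occurrences(points, k=1)
    print(scores)  # points with the highest counts are hub candidates
```

Points with unusually high counts are the hub candidates; in hub-based clustering such points could serve as cluster prototypes in place of centroids.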

Pages: 94-97

Page count: 4
