Clustering High Dimensional Data Using RIA

Cited by: 0
Authors
Aziz, Nazrina [1]
Affiliation
[1] Univ Utara Malaysia, Coll Arts & Sci, Sch Quantitat Sci, Sintok 06010, Kedah, Malaysia
Keywords
high dimensional data; covariance matrix; eigenstructure; dissimilarity measure
DOI
10.1063/1.4915706
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
Clustering may simply represent a convenient method for organizing a large data set so that it can be easily understood and information can be retrieved efficiently. However, identifying clusters in high-dimensional data sets is a difficult task because of the curse of dimensionality. Another challenge in clustering is that some traditional functions cannot capture the pattern dissimilarity among objects. In this article, we use an alternative dissimilarity measure called the Robust Influence Angle (RIA) in the partitioning method. RIA is developed using the eigenstructure of the covariance matrix and robust principal component scores. We observe that it obtains clusters easily and hence avoids the curse of dimensionality. It also manages to cluster large data sets with mixed numeric and categorical values.
Pages: 3