Comparison of Dimension Reduction Techniques on High Dimensional Datasets

Authors
Yildiz, Kazim [1]
Camurcu, Yilmaz [3]
Dogan, Buket [2]
Affiliations
[1] Marmara Univ, Dept Comp Engn, Istanbul, Turkey
[2] Marmara Univ, Comp Engn Dept, Technol Fac, Istanbul, Turkey
[3] Fatih Sultan Mehmet Waqf Univ, Dept Comp Engn, Fatih Istanbul, Turkey
Keywords
High dimensional data; clustering; dimensionality reduction; data mining; Fuzzy C-Means; classification
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
High dimensional data has become very common with the rapid growth of data stored in databases and other information repositories, so clustering such data has become an urgent problem. Well-known clustering algorithms are not adequate in high dimensional spaces because of the so-called curse of dimensionality. Dimensionality reduction techniques are therefore used to obtain more accurate clustering results and to shorten clustering time in high dimensional spaces. In this work, different dimensionality reduction techniques were combined with the Fuzzy C-Means clustering algorithm. The aim is to reduce the complexity of high dimensional datasets and to generate more accurate clustering results. The clustering results were compared in terms of cluster purity, cluster entropy and mutual information. The dimension reduction techniques were compared in terms of current Central Processing Unit (CPU) usage, current memory usage and elapsed CPU time. The experiments showed that the proposed approach produces promising results in high dimensional spaces.
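The pipeline described in the abstract (reduce the dimension, cluster with Fuzzy C-Means, then score the clusters) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: PCA stands in for the unnamed reduction techniques, the data are synthetic Gaussian blobs, and the function names, the fuzzifier m = 2 and the unnormalized base-2 entropy are illustrative choices. Mutual information is omitted for brevity.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its top principal components via SVD (assumed reducer)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard Fuzzy C-Means updates; returns (centers, membership matrix U)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)           # each row sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)             # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def purity_and_entropy(labels_true, labels_pred):
    """Cluster purity and size-weighted cluster entropy (base 2, unnormalized)."""
    n = len(labels_true)
    purity, entropy = 0.0, 0.0
    for k in np.unique(labels_pred):
        members = labels_true[labels_pred == k]
        counts = np.bincount(members)
        p = counts[counts > 0] / len(members)
        purity += counts.max() / n
        entropy += (len(members) / n) * -(p * np.log2(p)).sum()
    return purity, entropy

# Synthetic stand-in data: 3 classes, 50 points each, 50 dimensions.
rng = np.random.default_rng(42)
labels_true = np.repeat(np.arange(3), 50)
means = rng.normal(scale=10.0, size=(3, 50))
X = means[labels_true] + rng.normal(size=(150, 50))

Z = pca_reduce(X, n_components=2)               # 50 -> 2 dimensions
centers, U = fuzzy_c_means(Z, n_clusters=3)
labels_pred = U.argmax(axis=1)                  # harden fuzzy memberships
purity, entropy = purity_and_entropy(labels_true, labels_pred)
print(f"purity={purity:.3f}, entropy={entropy:.3f}")
```

Higher purity and lower entropy indicate clusters that agree better with the known classes. Timing each reducer, for example with `time.process_time()`, would give a rough analogue of the elapsed CPU time comparison mentioned in the abstract.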
Pages: 256-262
Number of pages: 7