A new unsupervised feature selection algorithm using similarity-based feature clustering

Cited by: 32
Authors
Zhu, Xiaoyan [1 ]
Wang, Yu [1 ]
Li, Yingbin [1 ]
Tan, Yonghui [1 ]
Wang, Guangtao [2 ]
Song, Qinbao [1 ]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Elect & Informat Engn, Xian, Shaanxi, Peoples R China
[2] JD AI Res, Mountain View, CA USA
Funding
National Natural Science Foundation of China;
Keywords
clustering; feature selection; feature similarity; CLASSIFICATION;
DOI
10.1111/coin.12192
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Unsupervised feature selection is an important problem, especially for high-dimensional data. However, it has been scarcely studied so far, and the existing algorithms cannot provide satisfactory performance. In this paper, we therefore propose a new unsupervised feature selection algorithm using similarity-based feature clustering, Feature Selection-based Feature Clustering (FSFC). FSFC removes redundant features according to the results of feature clustering based on feature similarity. First, it clusters the features according to their similarity, using a newly proposed feature clustering algorithm that overcomes the shortcomings of K-means. Second, it selects from each cluster a representative feature, which contains the most interesting information of the features in that cluster. The efficiency and effectiveness of FSFC are tested on real-world data sets and compared with two representative unsupervised feature selection algorithms, Feature Selection Using Similarity (FSUS) and Multi-Cluster-based Feature Selection (MCFS), in terms of runtime, feature compression ratio, and the clustering results of K-means. The results show that FSFC not only reduces the feature space in less time, but also significantly improves the clustering performance of K-means.
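The two-step idea in the abstract (cluster features by similarity, then keep one representative per cluster) can be illustrated with a minimal sketch. This is not the FSFC algorithm itself: the abstract does not specify the similarity measure, the clustering procedure, or the rule for choosing a representative, so the sketch assumes absolute Pearson correlation as the similarity, connected components of a thresholded similarity graph as the clusters, and the highest mean within-cluster similarity as the selection rule. The function name `select_features` and the parameter `threshold` are hypothetical.

```python
import numpy as np


def select_features(X, threshold=0.8):
    """Return sorted indices of one representative feature per cluster.

    Illustrative only; every design choice here is an assumption,
    not the method of the paper.
    """
    # Assumed similarity: absolute Pearson correlation between columns.
    sim = np.abs(np.corrcoef(X, rowvar=False))
    n = sim.shape[0]
    labels = -np.ones(n, dtype=int)
    cluster = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Assumed clustering: grow a connected component of the graph
        # that links features whose similarity reaches the threshold.
        labels[start] = cluster
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero((sim[i] >= threshold) & (labels == -1)):
                labels[j] = cluster
                stack.append(j)
        cluster += 1
    selected = []
    for c in range(cluster):
        members = np.flatnonzero(labels == c)
        # Assumed representative: the member with the highest mean
        # similarity to the features of its own cluster.
        mean_sim = sim[np.ix_(members, members)].mean(axis=1)
        selected.append(int(members[np.argmax(mean_sim)]))
    return sorted(selected)


# Toy data: columns 0 and 1 are near-duplicates, and so are columns 2 and 3.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, 2 * a + 0.01 * rng.normal(size=200), b, -b])
print(select_features(X))  # one index from each redundant pair
```

On the toy data, the four columns collapse to two selected features, which is the kind of redundancy removal the abstract describes. A single threshold is used here only to keep the sketch short.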
Pages: 2-22
Page count: 21
Related papers
  • [1] Unsupervised similarity-based feature selection using heuristic Hopfield neural networks
    Shi, SYM
    Suganthan, PN
    PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS 2003, VOLS 1-4, 2003, : 1838 - 1843
  • [2] Unsupervised feature selection using feature similarity
    Mitra, P
    Murthy, CA
    Pal, SK
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2002, 24 (03) : 301 - 312
  • [3] Unsupervised Feature Selection Algorithm Based on Similarity Matrix
    Gan, Wenya
    Ling, You
    Huang, Yuanling
    2015 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING APPLICATIONS (CSEA 2015), 2015, : 5 - 11
  • [4] An Unsupervised Attribute Clustering Algorithm for Unsupervised Feature Selection
    Zhou, Pei-Yuan
    Chan, Keith C. C.
    PROCEEDINGS OF THE 2015 IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (IEEE DSAA 2015), 2015, : 710 - 716
  • [5] Predictor output sensitivity and feature similarity-based feature selection
    Verikas, A.
    Bacauskiene, M.
    Valincius, D.
    Gelzinis, A.
    FUZZY SETS AND SYSTEMS, 2008, 159 (04) : 422 - 434
  • [6] Unsupervised feature selection based on adaptive similarity learning and subspace clustering
    Parsa, Mohsen Ghassemi
    Zare, Hadi
    Ghatee, Mehdi
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2020, 95 (95)
  • [7] Unsupervised Feature Selection with Feature Clustering
    Cheung, Yiu-ming
    Jia, Hong
    2012 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY (WI-IAT 2012), VOL 1, 2012, : 9 - 15
  • [8] Similarity-based constraint score for feature selection
    Salmi, Abderezak
    Hammouche, Kamal
    Macaire, Ludovic
    KNOWLEDGE-BASED SYSTEMS, 2020, 209
  • [9] Dynamic Feature Selection Based on Clustering Algorithm and Individual Similarity
    Dantas, Carine A.
    Nunes, Romulo O.
    Canuto, Anne M. P.
    Xavier-Junior, Joao C.
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, PT II, 2017, 10614 : 467 - 474
  • [10] Clustering algorithm using rough set theory for unsupervised feature selection
    Pacheco, Fannia
    Cerrada, Mariela
    Li, Chuan
    Sanchez, Rene Vinicio
    Cabrera, Diego
    de Oliveira, Jose Valente
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 3493 - 3499