A Clustering-Based Approach to Reduce Feature Redundancy

Cited by: 1
Authors
de Amorim, Renato Cordeiro [1 ]
Mirkin, Boris [2 ]
Affiliations
[1] Univ Hertfordshire, Sch Comp Sci, Coll Lane Campus, Hatfield AL10 9AB, Herts, England
[2] Birkbeck Univ London, Dept Comp Sci & Informat Syst, Malet St, London WC1E 7HX, England
Keywords
Unsupervised feature selection; Feature weighting; Redundant features; Clustering; Mental task separation; FEATURE-SELECTION; VARIABLES;
DOI
10.1007/978-3-319-19090-7_35
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Research effort has recently focused on designing feature weighting clustering algorithms. These algorithms automatically calculate the weight of each feature in a data set, representing its degree of relevance. However, since most of them evaluate one feature at a time, they may have difficulty clustering data sets that contain features carrying similar information. If a group of features contains the same relevant information, these clustering algorithms assign a high weight to each feature in the group instead of removing some of them as redundant. This paper introduces an unsupervised feature selection method that can be used in the data pre-processing step to reduce the number of redundant features in a data set. The method clusters similar features together and then selects a subset of representative features for each cluster. This selection is based on the maximum information compression index between each feature and its respective cluster centroid. We present an empirical validation of our method by comparing it with a popular unsupervised feature selection method on three EEG data sets. We find that our method selects features that produce better cluster recovery, without the need for an extra user-defined parameter.
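The cluster-then-select idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the clustering algorithm or say how many representatives are kept, so the sketch assumes plain k-means over feature vectors with a caller-supplied k (the paper states its method needs no extra user-defined parameter, so this is a simplification), a farthest-first initialisation, and one representative per cluster, chosen as the feature with the smallest index value to its centroid. The maximum information compression index is computed here as the smallest eigenvalue of the 2x2 covariance matrix of two variables. All function names are hypothetical.

```python
import math
from statistics import fmean, pvariance

def mici(x, y):
    """Maximum information compression index of two variables: the smallest
    eigenvalue of their 2x2 covariance matrix (0 if linearly dependent)."""
    mx, my = fmean(x), fmean(y)
    vx, vy = pvariance(x), pvariance(y)
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    # Eigenvalues of [[vx, cov], [cov, vy]] are (tr +/- sqrt(tr^2 - 4*det)) / 2.
    tr, det = vx + vy, vx * vy - cov * cov
    disc = max(tr * tr - 4.0 * det, 0.0)  # guard against rounding below zero
    return max((tr - math.sqrt(disc)) / 2.0, 0.0)

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans_features(features, k, iters=100):
    """Plain k-means where each point is one feature (a column of the data).
    Assumption: farthest-first initialisation, to keep the sketch deterministic."""
    centroids = [list(features[0])]
    while len(centroids) < k:
        far = max(features, key=lambda f: min(sq_dist(f, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(f, centroids[c]))
                      for f in features]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [f for f, l in zip(features, labels) if l == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [fmean(col) for col in zip(*members)]
    return labels, centroids

def select_representatives(features, k):
    """Assumption: keep one feature per cluster, the one with the lowest
    MICI to the cluster centroid."""
    labels, centroids = kmeans_features(features, k)
    selected = []
    for c in range(k):
        members = [i for i, l in enumerate(labels) if l == c]
        if members:
            selected.append(min(members,
                                key=lambda i: mici(features[i], centroids[c])))
    return sorted(selected)

# Toy data (made up for illustration): features 0 and 1 are near-duplicates,
# as are features 2 and 3. Each inner list holds one feature over six samples.
features = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [1.1, 2.0, 2.9, 4.2, 5.0, 5.9],
    [6.0, 1.0, 5.0, 2.0, 4.0, 3.0],
    [6.1, 0.9, 5.0, 2.1, 4.0, 3.0],
]
selected = select_representatives(features, k=2)
print(selected)
```

On this toy input the sketch keeps one feature from each near-duplicate pair, which is the redundancy reduction the abstract describes. Which member of each pair is kept depends on the data and on the assumptions listed above.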
Pages: 465-475
Number of pages: 11