A Clustering-Based Approach to Reduce Feature Redundancy

Cited by: 1
Authors
de Amorim, Renato Cordeiro [1]
Mirkin, Boris [2]
Affiliations
[1] Univ Hertfordshire, Sch Comp Sci, Coll Lane Campus, Hatfield AL10 9AB, Herts, England
[2] Birkbeck Univ London, Dept Comp Sci & Informat Syst, Malet St, London WC1E 7HX, England
Keywords
Unsupervised feature selection; Feature weighting; Redundant features; Clustering; Mental task separation; FEATURE-SELECTION; VARIABLES
DOI
10.1007/978-3-319-19090-7_35
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Research effort has recently focused on designing feature weighting clustering algorithms. These algorithms automatically calculate the weight of each feature in a data set, representing its degree of relevance. However, since most of them evaluate one feature at a time, they may have difficulty clustering data sets that contain features with similar information. If a group of features contains the same relevant information, these clustering algorithms assign a high weight to each feature in the group instead of removing some of them as redundant. This paper introduces an unsupervised feature selection method that can be used in the data pre-processing step to reduce the number of redundant features in a data set. The method clusters similar features together and then selects a subset of representative features for each cluster. The selection is based on the maximum information compression index between each feature and its respective cluster centroid. We present an empirical validation of our method by comparing it with a popular unsupervised feature selection method on three EEG data sets. We find that our method selects features that produce better cluster recovery, without the need for an extra user-defined parameter.
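The pipeline described in the abstract (cluster the features, then keep the feature closest to each cluster centroid under the maximum information compression index) can be illustrated with a short sketch. This is not the authors' implementation: the abstract does not name the clustering algorithm, so plain k-means with a user-supplied `k` is assumed here, which reintroduces a parameter the paper says it avoids. It also keeps a single representative per cluster, and assumes no feature is constant. The function names `mici`, `kmeans_features`, and `select_representatives` are hypothetical. The index is computed in its usual form, as the smallest eigenvalue of the 2x2 covariance matrix of the two variables, which is zero when they are linearly dependent.

```python
import numpy as np


def mici(x, y):
    """Maximum information compression index of two 1-D arrays:
    the smallest eigenvalue of their 2x2 covariance matrix."""
    cov = np.cov(x, y)
    return float(np.linalg.eigvalsh(cov)[0])  # eigvalsh sorts ascending


def kmeans_features(F, k, n_iter=100, seed=0):
    """Plain k-means over the rows of F (one row per feature).
    Assumption: the paper's actual clustering step may differ."""
    rng = np.random.default_rng(seed)
    centroids = F[rng.choice(len(F), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dists = ((F[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new = np.array([
            F[labels == c].mean(axis=0) if np.any(labels == c) else centroids[c]
            for c in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    # Recompute labels so they match the returned centroids.
    dists = ((F[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1), centroids


def select_representatives(X, k, seed=0):
    """Return sorted column indices of one representative feature per cluster.
    X has shape (n_samples, n_features); columns must not be constant."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardise each feature
    F = Z.T                                   # cluster features, not samples
    labels, centroids = kmeans_features(F, k, seed=seed)
    selected = []
    for c in range(k):
        members = np.flatnonzero(labels == c)
        if members.size == 0:
            continue
        scores = [mici(F[j], centroids[c]) for j in members]
        selected.append(int(members[int(np.argmin(scores))]))
    return sorted(selected)
```

For a data matrix whose columns fall into two groups of near-duplicate features, `select_representatives(X, k=2)` would be expected to return one column index from each group.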
Pages: 465-475
Page count: 11
Related papers
50 in total
  • [21] Unsupervised Feature Selection Based on Spectral Clustering with Maximum Relevancy and Minimum Redundancy Approach
    Khozaei, Bahareh
    Eftekhari, Mahdi
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2021, 35 (11)
  • [22] A Clustering-Based Approach to Kinetic Closest Pair
    Zahed Rahmati
    Timothy M. Chan
    [J]. Algorithmica, 2018, 80 : 2742 - 2756
  • [23] On hierarchical clustering-based approach for RDDBS design
    Hassan I. Abdalla
    Ali A. Amer
    Sri Devi Ravana
    [J]. Journal of Big Data, 10
  • [24] Clustering-based approach for medical data classification
    Kodabagi, Mallikarjun M.
    Tikotikar, Ahelam
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2019, 31 (14)
  • [25] Facial expression recognition: A clustering-based approach
    Chen, XW
    Huang, T
    [J]. PATTERN RECOGNITION LETTERS, 2003, 24 (9-10) : 1295 - 1302
  • [26] Fuzzy Clustering-Based Approach for Outlier Detection
    Al-Zoubi, Moh'd Belal
    Ali, Al-Dahoud
    Yahya, Abdelfatah A.
    [J]. RECENT ADVANCES AND APPLICATIONS OF COMPUTER ENGINEERING: PROCEEDINGS OF THE 9TH WSEAS INTERNATIONAL CONFERENCE (ACE 10), 2010, : 192 - +
  • [27] A Clustering-Based Approach to the Mining of Analogical Proportions
    Beltran, William Correa
    Jaudoin, Helene
    Pivert, Olivier
    [J]. 2015 IEEE 27TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2015), 2015, : 125 - 131
  • [28] A Clustering-Based Approach to Kinetic Closest Pair
    Rahmati, Zahed
    Chan, Timothy M.
    [J]. ALGORITHMICA, 2018, 80 (10) : 2742 - 2756
  • [29] A novel clustering-based approach to schema matching
    Pei, Jin
    Hong, Jun
    Bell, David
    [J]. ADVANCES IN INFORMATION SYSTEMS, PROCEEDINGS, 2006, 4243 : 60 - 69
  • [30] On hierarchical clustering-based approach for RDDBS design
    Abdalla, Hassan I.
    Amer, Ali A.
    Ravana, Sri Devi
    [J]. JOURNAL OF BIG DATA, 2023, 10 (01)