Unsupervised feature selection using feature similarity

Cited by: 955
Authors
Mitra, P [1]
Murthy, CA [1]
Pal, SK [1]
Affiliations
[1] Indian Stat Inst, Inst Machine Intelligence, Kolkata 700035, W Bengal, India
Keywords
data mining; pattern recognition; dimensionality reduction; feature clustering; multiscale representation; entropy measures;
DOI
10.1109/34.990133
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this article, we describe an unsupervised feature selection algorithm suitable for data sets, large in both dimension and size. The method is based on measuring similarity between features whereby redundancy therein is removed. This does not need any search and, therefore, is fast. A new feature similarity measure, called maximum information compression index, is introduced. The algorithm is generic in nature and has the capability of multiscale representation of data sets. The superiority of the algorithm, in terms of speed and performance, is established extensively over various real-life data sets of different sizes and dimensions. It is also demonstrated how redundancy and information loss in feature selection can be quantified with an entropy measure.
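The abstract names the maximum information compression index but does not give its formula. As a minimal sketch, assuming the index is taken to be the smallest eigenvalue of the 2x2 covariance matrix of a feature pair (an assumption not stated in this record), it could be computed as below. The function name `mici` and the use of population (divide-by-n) covariance are illustrative choices, not details from the source.

```python
import math


def mici(x, y):
    """Smallest eigenvalue of the 2x2 covariance matrix of features x and y.

    Assumed definition (not given in the abstract): a value of 0 means the
    two features are linearly dependent; larger values mean less redundancy.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Population variances and covariance (divide by n; an illustrative choice).
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

    # Eigenvalues of [[var_x, cov_xy], [cov_xy, var_y]] are
    # (trace +/- sqrt(trace^2 - 4 * det)) / 2; take the smaller one.
    trace = var_x + var_y
    det = var_x * var_y - cov_xy ** 2
    disc = max(trace * trace - 4.0 * det, 0.0)  # clamp tiny negative round-off
    return 0.5 * (trace - math.sqrt(disc))


# A feature and its exact multiple are fully redundant.
redundant = mici([1, 2, 3, 4], [2, 4, 6, 8])

# Two uncorrelated unit-variance features carry distinct information.
distinct = mici([1, -1, 1, -1], [1, 1, -1, -1])
```

Under this assumed definition, a pairwise measure of this kind needs no search over feature subsets, which is consistent with the abstract's claim that the method is fast.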
Pages: 301-312
Page count: 12
Related papers
50 in total
  • [1] A new unsupervised feature selection algorithm using similarity-based feature clustering
    Zhu, Xiaoyan
    Wang, Yu
    Li, Yingbin
    Tan, Yonghui
    Wang, Guangtao
    Song, Qinbao
    [J]. COMPUTATIONAL INTELLIGENCE, 2019, 35 (01) : 2 - 22
  • [2] Unsupervised feature selection using feature similarity (vol 24, pg 301, 2002)
    Mitra, P
    Murthy, CA
    Pal, SK
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2002, 24 (06) : 721 - 721
  • [3] Unsupervised authorship attribution using feature selection and weighted cosine similarity
    Martin-del-Campo-Rodriguez, Carolina
    Sidorov, Grigori
    Batyrshin, Ildar
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 42 (05) : 4357 - 4367
  • [4] Unsupervised Feature Selection Algorithm Based on Similarity Matrix
    Gan, Wenya
    Ling, You
    Huang, Yuanling
    [J]. 2015 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING APPLICATIONS (CSEA 2015), 2015, : 5 - 11
  • [5] Unsupervised Feature Selection with Feature Clustering
    Cheung, Yiu-ming
    Jia, Hong
    [J]. 2012 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY (WI-IAT 2012), VOL 1, 2012, : 9 - 15
  • [6] Similarity Preserving Unsupervised Feature Selection based on Sparse Learning
    Zare, Hadi
    Parsa, Mohsen Ghasemi
    Ghatee, Mehdi
    Alizadeh, Sasan H.
    [J]. 2020 10TH INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS (IST), 2020, : 50 - 55
  • [7] Unsupervised feature selection with high-order similarity learning
    Mi, Yong
    Chen, Hongmei
    Luo, Chuan
    Horng, Shi-Jinn
    Li, Tianrui
    [J]. KNOWLEDGE-BASED SYSTEMS, 2024, 285
  • [8] Feature selection using structural similarity
    Mitra, Sushmita
    Kundu, Partha Pratim
    Pedrycz, Witold
    [J]. INFORMATION SCIENCES, 2012, 198 : 48 - 61
  • [9] Unsupervised similarity-based feature selection using heuristic Hopfield neural networks
    Shi, SYM
    Suganthan, PN
    [J]. PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS 2003, VOLS 1-4, 2003, : 1838 - 1843
  • [10] Feature weighting as a tool for unsupervised feature selection
    Panday, Deepak
    de Amorim, Renato Cordeiro
    Lane, Peter
    [J]. INFORMATION PROCESSING LETTERS, 2018, 129 : 44 - 52