Unsupervised feature selection using feature similarity

Citations: 955
Authors
Mitra, P [1]
Murthy, CA [1]
Pal, SK [1]
Affiliation
[1] Indian Stat Inst, Inst Machine Intelligence, Kolkata 700035, W Bengal, India
Keywords
data mining; pattern recognition; dimensionality reduction; feature clustering; multiscale representation; entropy measures
DOI
10.1109/34.990133
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this article, we describe an unsupervised feature selection algorithm suitable for data sets, large in both dimension and size. The method is based on measuring similarity between features whereby redundancy therein is removed. This does not need any search and, therefore, is fast. A new feature similarity measure, called maximum information compression index, is introduced. The algorithm is generic in nature and has the capability of multiscale representation of data sets. The superiority of the algorithm, in terms of speed and performance, is established extensively over various real-life data sets of different sizes and dimensions. It is also demonstrated how redundancy and information loss in feature selection can be quantified with an entropy measure.
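The abstract names the maximum information compression index without defining it. As the measure is usually stated, the index for two features is the smaller eigenvalue of their 2×2 covariance matrix: it is zero when the features are linearly dependent and grows as they become less so. The sketch below illustrates that definition under stated assumptions; the function name `mici` and the synthetic data are hypothetical, and this is not the authors' implementation of the full selection algorithm.

```python
import numpy as np


def mici(x, y):
    """Smaller eigenvalue of the 2x2 sample covariance matrix of x and y.

    Assumed definition of the maximum information compression index;
    the name `mici` is illustrative, not taken from the record above.
    """
    cov = np.cov(np.vstack([x, y]))  # 2x2 sample covariance matrix
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order
    return float(np.linalg.eigvalsh(cov)[0])


# Synthetic features (assumption, for illustration only)
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y_dependent = 3.0 * x + 1.0        # exact linear function of x
y_independent = rng.normal(size=200)

redundant_score = mici(x, y_dependent)      # close to zero
distinct_score = mici(x, y_independent)     # clearly positive
```

A feature pair with a score near zero is redundant in this sense, so one of the two can be discarded with little loss; the measure is symmetric in its arguments.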
Pages: 301-312
Page count: 12