Fast hybrid dimensionality reduction method for classification based on feature selection and grouped feature extraction

Cited by: 59
Authors
Li, Mengmeng [1, 2]
Wang, Haofeng [1, 2]
Yang, Lifang [1, 2]
Liang, You [1, 2]
Shang, Zhigang [1, 2, 3]
Wan, Hong [1, 2, 3]
Affiliations
[1] Zhengzhou Univ, Sch Elect Engn, Zhengzhou 450001, Henan, Peoples R China
[2] Zhengzhou Univ, Ind Technol Res Inst, Zhengzhou 450001, Henan, Peoples R China
[3] Henan Key Lab Brain Sci & Brain Comp Interface Te, Zhengzhou 450001, Henan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Dimensionality Reduction; Intrinsic Dimensionality; Feature Selection; Feature Cluster; PCA; INTEGRATING FEATURE-SELECTION;
DOI
10.1016/j.eswa.2020.113277
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Dimensionality reduction is a basic and critical technology for data mining, especially in the current "big data" era. Feature selection and feature extraction are two different types of dimensionality reduction methods, and each has its pros and cons. In this paper, we combine multi-strategy feature selection with grouped feature extraction and propose a novel fast hybrid dimensionality reduction method that incorporates the advantages of both in removing irrelevant and redundant information. First, the intrinsic dimensionality of the data set is estimated by maximum likelihood estimation. Fisher Score and Information Gain based feature selection are then used as multi-strategy methods to remove irrelevant features. With the redundancy among the selected features as the clustering criterion, the features are grouped into a certain number of clusters. In every cluster, Principal Component Analysis (PCA) based feature extraction is carried out to remove redundant information. Four classical classifiers and representation entropy are used to evaluate the classification performance and information loss of the reduced set. The runtime results show that the proposed hybrid method is consistently much faster than the three other methods on almost all of the data sets used. Meanwhile, the proposed method shows competitive classification performance, with essentially no significant difference from the other methods. In summary, the proposed method reduces the dimensionality of the raw data quickly, with excellent efficiency and competitive classification performance compared with the contrasting methods. © 2020 Elsevier Ltd. All rights reserved.
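The select, group, and extract steps described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation and makes several simplifying assumptions: only Fisher Score is used for selection (Information Gain is omitted), the number of features to keep is passed in as `keep` rather than derived from a maximum likelihood intrinsic dimensionality estimate, redundancy is measured by absolute Pearson correlation with a greedy threshold grouping, and each group is reduced to its first principal component by power iteration. All function names and the `threshold` parameter are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def fisher_score(column, labels):
    """Between-class scatter divided by within-class scatter for one feature."""
    overall = mean(column)
    between = within = 0.0
    for cls in set(labels):
        vals = [v for v, y in zip(column, labels) if y == cls]
        between += len(vals) * (mean(vals) - overall) ** 2
        within += len(vals) * pvariance(vals)
    return between / within if within > 0 else float("inf")


def abs_corr(a, b):
    """Absolute Pearson correlation, used here as the redundancy measure (assumption)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    na = sqrt(sum((x - ma) ** 2 for x in a))
    nb = sqrt(sum((y - mb) ** 2 for y in b))
    return abs(cov / (na * nb)) if na and nb else 0.0


def group_by_redundancy(columns, threshold):
    """Greedy grouping: a feature joins the first group whose seed feature
    it correlates with at or above the threshold, else it starts a new group."""
    groups = []
    for j, col in enumerate(columns):
        for group in groups:
            if abs_corr(columns[group[0]], col) >= threshold:
                group.append(j)
                break
        else:
            groups.append([j])
    return groups


def first_pc_scores(columns, iters=200):
    """Project a group of features onto its first principal component,
    found by power iteration on the covariance matrix."""
    centered = []
    for col in columns:
        m = mean(col)
        centered.append([v - m for v in col])
    k, n = len(centered), len(centered[0])
    cov = [[sum(a * b for a, b in zip(ci, cj)) / n for cj in centered]
           for ci in centered]
    # Non-uniform start vector, so it is unlikely to be orthogonal to the top eigenvector.
    w = [1.0 / (i + 1) for i in range(k)]
    for _ in range(iters):
        w = [sum(cov[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = sqrt(sum(x * x for x in w)) or 1.0
        w = [x / norm for x in w]
    return [sum(w[i] * centered[i][r] for i in range(k)) for r in range(n)]


def hybrid_reduce(columns, labels, keep, threshold=0.9):
    """Select the `keep` best features by Fisher Score, group the redundant
    ones, and replace each group with its first principal component."""
    ranked = sorted(range(len(columns)),
                    key=lambda j: fisher_score(columns[j], labels),
                    reverse=True)
    selected = sorted(ranked[:keep])
    local_groups = group_by_redundancy([columns[j] for j in selected], threshold)
    groups = [[selected[j] for j in g] for g in local_groups]
    components = [first_pc_scores([columns[j] for j in g]) for g in groups]
    return components, groups


# Toy data: feature 1 is an exact multiple of feature 0, feature 2 is uninformative.
labels = [0, 0, 0, 1, 1, 1]
f0 = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]
f1 = [2.0, 2.4, 1.6, 6.0, 6.4, 5.6]
f2 = [5.0, 1.0, 3.0, 4.0, 2.0, 3.0]
components, groups = hybrid_reduce([f0, f1, f2], labels, keep=2)
```

On the toy data, the uninformative feature is discarded by the selection step and the two redundant features are merged into a single extracted component. A practical version would more likely rely on an established PCA routine and a proper clustering algorithm than on these hand-rolled helpers.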
Pages: 10
Related papers
50 in total
  • [1] Hybrid dimension reduction by integrating feature selection with feature extraction method for text clustering
    Bharti, Kusum Kumari
    Singh, Pramod Kumar
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (06) : 3105 - 3114
  • [2] Feature selection for dimensionality reduction
    Mladenic, Dunja
    [J]. SUBSPACE, LATENT STRUCTURE AND FEATURE SELECTION, 2006, 3940 : 84 - 102
  • [3] A Fast Hybrid Feature Selection Method
    Ganjei, Mohammad Ahmadi
    Boostani, Reza
    [J]. 2019 9TH INTERNATIONAL CONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE 2019), 2019, : 6 - 11
  • [4] Bilinear Lanczos components for fast dimensionality reduction and feature extraction
    Ren, Chuan-Xian
    Dai, Dao-Qing
    [J]. PATTERN RECOGNITION, 2010, 43 (11) : 3742 - 3752
  • [5] Dimensionality reduction-based feature extraction and classification on fleece fabric images
    Yildiz, Kazim
    [J]. SIGNAL IMAGE AND VIDEO PROCESSING, 2017, 11 (02) : 317 - 323
  • [6] Dimensionality reduction-based feature extraction and classification on fleece fabric images
    Kazim Yildiz
    [J]. Signal, Image and Video Processing, 2017, 11 : 317 - 323
  • [7] Fault detection and classification by unsupervised feature extraction and dimensionality reduction
    Praveen Chopra
    Sandeep Kumar Yadav
    [J]. Complex & Intelligent Systems, 2015, 1 (1-4) : 25 - 33
  • [8] Heuristic Search Algorithm for Dimensionality Reduction Optimally Combining Feature Selection and Feature Extraction
    He, Baokun
    Shah, Swair
    Maung, Crystal
    Arnold, Gordon
    Wan, Guihong
    Schweitzer, Haim
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 2280 - 2287
  • [9] Multiscale feature extraction for time series classification with hybrid feature selection
    Zhang, Hui
    Lin, Mao-Song
    Huang, Wei
    Kawasaki, Saori
    Ho, Tu Bao
    [J]. INTELLIGENT CONTROL AND AUTOMATION, 2006, 344 : 939 - 944
  • [10] Algorithmic Feature Selection and Dimensionality Reduction in Signal Classification Tasks
    Zavadil, Jan
    Kus, Vaclav
    Chlada, Milan
    [J]. MATHEMATICAL MODELING IN PHYSICAL SCIENCES, IC-MSQUARE 2023, 2024, 446 : 187 - 193