A novel autoencoder approach to feature extraction with linear separability for high-dimensional data

Cited by: 0
Authors
Zheng J. [1]
Qu H. [1, 2]
Li Z. [1]
Li L. [1]
Tang X. [2]
Guo F. [2]
Affiliations
[1] College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing
[2] College of Automation, Chongqing University of Posts and Telecommunications, Chongqing
Funding
National Natural Science Foundation of China
Keywords
Autoencoder; Distance metric; Feature extraction
DOI
10.7717/PEERJ-CS.1061
Abstract
Feature extraction relies on sufficient information in the input data; however, data distributed over a high-dimensional space are too sparse to provide it. High dimensionality also complicates the search for features scattered across subspaces. Feature extraction from high-dimensional data is therefore a difficult task. To address this issue, this article proposes a novel autoencoder method that uses the Mahalanobis distance metric of a rescaling transformation. The key idea is that applying this metric reduces the difference between the reconstructed distribution and the original distribution, which improves the autoencoder's ability to extract features. Results show that the proposed approach outperforms state-of-the-art methods in both the accuracy of feature extraction and the linear separability of the extracted features. We indicate that distance metric-based methods are better suited than feature selection-based methods to extracting linearly separable features from high-dimensional data. In a high-dimensional space, evaluating feature similarity is relatively easier than evaluating feature importance, so distance metric methods, which evaluate feature similarity, have an advantage over feature selection methods, which assess feature importance, although evaluating feature importance is more computationally efficient than evaluating feature similarity. © 2022 Zheng et al.
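The abstract does not give the exact form of the metric, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the standard squared Mahalanobis distance (x - y)^T M (x - y) with M taken, hypothetically, as the inverse sample covariance, and shows that this distance equals a plain Euclidean distance after a linear rescaling by a factor L with M = L^T L. The names `mahalanobis_sq`, `x_hat`, and the toy data are illustrative assumptions.

```python
import numpy as np


def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a PSD matrix M."""
    d = x - y
    return float(d @ M @ d)


rng = np.random.default_rng(0)

# Toy data with different per-feature scales (illustrative only).
X = rng.normal(size=(200, 5)) * np.array([1.0, 2.0, 0.5, 3.0, 1.5])

# Assumption: M is the inverse of the sample covariance of X.
cov = np.cov(X, rowvar=False)
M = np.linalg.inv(cov)

# Factor M = C C^T (Cholesky), so L = C^T satisfies M = L^T L.
C = np.linalg.cholesky(M)
L = C.T

x = X[0]
# Stand-in for an autoencoder reconstruction of x (no network is trained here).
x_hat = x + 0.1 * rng.normal(size=5)

# Distance computed directly, and as a Euclidean distance after rescaling by L.
d_direct = mahalanobis_sq(x, x_hat, M)
d_rescaled = float(np.sum((L @ x - L @ x_hat) ** 2))

print(np.isclose(d_direct, d_rescaled))
```

Under these assumptions, using the distance as a reconstruction loss amounts to comparing the input and its reconstruction in the rescaled coordinates L x rather than in the raw feature space.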