A novel autoencoder approach to feature extraction with linear separability for high-dimensional data

Authors
Zheng J. [1]
Qu H. [1, 2]
Li Z. [1]
Li L. [1]
Tang X. [2]
Guo F. [2]
Affiliations
[1] College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing
[2] College of Automation, Chongqing University of Posts and Telecommunications, Chongqing
Funding
National Natural Science Foundation of China
Keywords
Autoencoder; Distance metric; Feature extraction
DOI
10.7717/PEERJ-CS.1061
Abstract
Feature extraction often relies on sufficient information in the input data. However, data distributed in a high-dimensional space are too sparse to provide that information. High dimensionality also makes it harder to search for features scattered across subspaces. Feature extraction from high-dimensional data is therefore a difficult task. To address this issue, this article proposes a novel autoencoder method that uses a Mahalanobis distance metric with a rescaling transformation. The key idea is that applying this distance metric reduces the difference between the reconstructed distribution and the original distribution, which improves the feature extraction ability of the autoencoder. Results show that the proposed approach outperforms state-of-the-art methods in both the accuracy of feature extraction and the linear separability of the extracted features. We indicate that distance metric-based methods are more suitable than feature selection-based methods for extracting linearly separable features from high-dimensional data. In a high-dimensional space, evaluating feature similarity is relatively easier than evaluating feature importance, so distance metric methods, which evaluate feature similarity, have an advantage for feature extraction over feature selection methods, which assess feature importance. Evaluating feature importance, however, is more computationally efficient than evaluating feature similarity. © 2022 Zheng et al.
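To make the idea of a rescaled distance concrete, the following is a minimal sketch and not the authors' method: the abstract does not give the exact formulation, so this assumes a diagonal covariance, in which case the Mahalanobis distance reduces to a per-feature rescaling of squared differences by variance. The function names, the toy data, and the reconstruction `x_hat` are all hypothetical.

```python
import math
from statistics import pvariance


def feature_variances(rows, eps=1e-12):
    """Per-feature population variance, floored at eps to avoid division by zero."""
    columns = list(zip(*rows))
    return [max(pvariance(col), eps) for col in columns]


def diagonal_mahalanobis(x, y, variances):
    """Mahalanobis distance under an ASSUMED diagonal covariance.

    Each squared difference is rescaled by that feature's variance, so
    features with a large numeric range do not dominate the distance.
    """
    return math.sqrt(sum((a - b) ** 2 / v for a, b, v in zip(x, y, variances)))


def euclidean(x, y):
    """Plain Euclidean distance, shown for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Toy data (hypothetical): the second feature has a much larger scale.
data = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0], [4.0, 400.0]]
variances = feature_variances(data)

x = data[0]
x_hat = [1.5, 150.0]  # hypothetical autoencoder reconstruction of x

print("Euclidean reconstruction error:  ", euclidean(x, x_hat))
print("Rescaled (diagonal Mahalanobis): ", diagonal_mahalanobis(x, x_hat, variances))
```

In this toy case the Euclidean error is driven almost entirely by the large-scale second feature, while the rescaled distance weights both features equally relative to their spread. A full Mahalanobis distance would use the inverse of the complete covariance matrix rather than only its diagonal.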