Reducing Data Complexity Using Autoencoders With Class-Informed Loss Functions

Cited by: 10
Authors
Charte, David [1]
Charte, Francisco [2]
Herrera, Francisco [1, 3]
Affiliations
[1] Univ Granada, Comp Sci & AI Dept, Granada 18071, Spain
[2] Univ Jaen, Comp Sci Dept, Jaen 23071, Spain
[3] King Abdulaziz Univ, Fac Comp & Informat Technol, Jeddah 21589, Saudi Arabia
Keywords
Complexity theory; Feature extraction; Measurement; Shape; Support vector machines; Data models; Transforms; Autoencoders; Dimension reduction; Data complexity; Dimensionality reduction; Feature selection
DOI
10.1109/TPAMI.2021.3127698
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Available data in machine learning applications is becoming increasingly complex, due to higher dimensionality and difficult classes. There exists a wide variety of approaches to measuring complexity of labeled data, according to class overlap, separability or boundary shapes, as well as group morphology. Many techniques can transform the data in order to find better features, but few focus on specifically reducing data complexity. Most data transformation methods mainly treat the dimensionality aspect, leaving aside the available information within class labels which can be useful when classes are somehow complex. This paper proposes an autoencoder-based approach to complexity reduction, using class labels in order to inform the loss function about the adequacy of the generated variables. This leads to three different new feature learners, Scorer, Skaler and Slicer. They are based on Fisher's discriminant ratio, the Kullback-Leibler divergence and least-squares support vector machines, respectively. They can be applied as a preprocessing stage for a binary classification problem. A thorough experimentation across a collection of 27 datasets and a range of complexity and classification metrics shows that class-informed autoencoders perform better than 4 other popular unsupervised feature extraction techniques, especially when the final objective is using the data for a classification task.
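As a rough illustration of what a class-informed loss might look like, the sketch below adds a separability penalty based on Fisher's discriminant ratio to a plain reconstruction error. This is not the paper's formulation of Scorer: the function names, the `weight` parameter, and the penalty form `1 / (1 + ratio)` are assumptions chosen only to show the idea for a binary problem with labels 0 and 1.

```python
from statistics import fmean, pvariance


def fisher_ratio(feature, labels):
    """Fisher's discriminant ratio of one feature for binary labels (0/1)."""
    a = [x for x, y in zip(feature, labels) if y == 0]
    b = [x for x, y in zip(feature, labels) if y == 1]
    gap = (fmean(a) - fmean(b)) ** 2        # squared distance between class means
    spread = pvariance(a) + pvariance(b)    # summed within-class variances
    return gap / spread if spread > 0 else float("inf")


def class_informed_loss(inputs, reconstructions, codes, labels, weight=0.1):
    """Reconstruction MSE plus a penalty that shrinks as classes separate.

    `codes` holds the encoded (low-dimensional) instances. The penalty form
    and `weight` are illustrative assumptions, not taken from the paper.
    """
    n, d = len(inputs), len(inputs[0])
    mse = sum(
        (x - r) ** 2
        for xs, rs in zip(inputs, reconstructions)
        for x, r in zip(xs, rs)
    ) / (n * d)
    # Score each encoded feature and keep the most discriminative one.
    best = max(fisher_ratio(col, labels) for col in zip(*codes))
    penalty = 1.0 / (1.0 + best)
    return mse + weight * penalty
```

With identical reconstructions, an encoding whose classes sit far apart along some feature receives a lower loss than one whose classes overlap, which is the behavior a class-informed feature learner would reward during training.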
Pages: 9549-9560
Page count: 12
Related papers
34 in total
  • [1] Reducing Dimensionality of Data Using Autoencoders
    Janakiramaiah, B.
    Kalyani, G.
    Narayana, S.
    Krishna, T. Bala Murali
    [J]. SMART INTELLIGENT COMPUTING AND APPLICATIONS, VOL 2, 2020, 160 : 51 - 58
  • [2] Post-consultation acute respiratory tract infection recovery: a latent class-informed analysis of individual patient data
    Hounkpatin, Hilda
    Stuart, Beth
    Zhu, Shihua
    Yao, Guiqing
    Moore, Michael
    Loffler, Christin
    Little, Paul
    Kenealy, Timothy
    Gillespie, David
    Francis, Nick A.
    Bostock, Jennifer
    Becque, Taeko
    Arroll, Bruce
    Altiner, Attila
    Alonso-Coello, Pablo
    Hay, Alastair D.
    [J]. BRITISH JOURNAL OF GENERAL PRACTICE, 2023, 73 (728) : E196 - E203
  • [3] Unsupervised fault detection in refrigeration showcase with single class data using autoencoders
    Santana, Adamo
    Kawamura, Yu
    Murakami, Kenya
    Iizaka, Tatsuya
    Matsui, Tetsuro
    Fukuyama, Yoshikazu
    [J]. IEEJ Transactions on Electronics, Information and Systems, 2019, 139 (10) : 1191 - 1200
  • [4] Synthesizing Data Using Variational Autoencoders for Handling Class Imbalanced Deep Learning
    Sheikh, Taimoor Shakeel
    Khan, Adil
    Fahim, Muhammad
    Ahmad, Muhammad
    [J]. ANALYSIS OF IMAGES, SOCIAL NETWORKS AND TEXTS (AIST 2019), 2020, 1086 : 270 - 281
  • [5] Analysis of multisubject neuroimaging data using anatomically informed basis functions
    Kiebel, SJ
    Friston, KJ
    [J]. NEUROIMAGE, 2001, 13 (06) : S172 - S172
  • [7] Audio-visual speech synthesis using vision transformer-enhanced autoencoders with ensemble of loss functions
    Ghosh, Subhayu
    Sarkar, Snehashis
    Ghosh, Sovan
    Zalkow, Frank
    Jana, Nanda Dulal
    [J]. APPLIED INTELLIGENCE, 2024, 54 (06) : 4507 - 4524
  • [8] Steganography Technique to Prevent Data Loss by Using Boolean Functions
    Dash, Satya Ranjan
    Sen, Alo
    Hassan, Sk. Sarif
    Roy, Rahul
    Misra, Chinmaya
    Singh, Kamakhya Narain
    [J]. ARTIFICIAL INTELLIGENCE AND EVOLUTIONARY COMPUTATIONS IN ENGINEERING SYSTEMS, ICAIECES 2017, 2018, 668 : 263 - 270
  • [9] Parameter Identification for Constrained Data Using a New Class of Rational Fractal Functions
    Katiyar, S. K.
    Chand, A. K. B.
    Jha, S.
    [J]. NUMERICAL ANALYSIS AND APPLICATIONS, 2021, 14 (03) : 225 - 237