Data Entropy-Based Imbalanced Learning

Times cited: 0
Authors
Fan, Yutao [1, 2, 3, 4]
Huang, Heming [1, 2, 3]
Affiliations
[1] Qinghai Normal Univ, Xining 810008, Peoples R China
[2] State Key Lab Tibetan Intelligent Informat Proc &, Xining 810008, Peoples R China
[3] Minist Educ, Key Lab Tibetan Informat Proc, Xining 810008, Peoples R China
[4] North China Inst Sci & Technol, Beijing 065201, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
data entropy; deep learning; imbalanced learning; NETWORKS;
DOI
10.1007/978-3-031-67871-4_7
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The skewed distribution of observations across classes has long been regarded as the cause of poor classification performance in machine learning, and especially of the performance bias among classes. However, our recent study challenges this notion. We argue that the bias in classification performance comes from an imbalance of information among classes rather than merely an imbalance of observations. To reflect the information imbalance among classes, we propose an indicator, data entropy, that captures the randomness within classes. A dataset with balanced and higher data entropies across its classes is more likely to exhibit improved classification performance. Furthermore, we propose another indicator, data mutual information, that quantifies the similarity between classes. Higher values indicate that models can leverage learning across classes to enhance their learning capacity. Therefore, reducing the difference in data entropy between classes while also increasing data mutual information is advantageous for classification. Our experiments, conducted with four models (SVM, CNN, Transformer, including its variant ViT, and DNN) on the datasets CIFAR-10, Airline Satisfaction, Smoking Body Signal, and Liver Cirrhosis, validate the efficacy of the proposed indicators. By applying resampling techniques to the Liver Cirrhosis dataset to rebalance the data entropy distribution among classes and to increase both the data entropy within classes and the data mutual information, we observe classification improvements, measured by d-index, across the four models.
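The abstract does not give the formula for data entropy, so the following is only an illustrative sketch under a stated assumption: each class's entropy is taken as the Shannon entropy of a histogram over that class's feature values. The function names, the bin count, and the toy data are hypothetical and are not the authors' method. The sketch shows how a class with many observations can still carry less entropy than a class with few.

```python
import numpy as np

def class_entropy(values, bins=16, value_range=(0.0, 1.0)):
    """Shannon entropy in bits of a histogram over one class's feature values.

    Assumption: a histogram-based estimate stands in for "data entropy";
    the abstract does not define the indicator.
    """
    counts, _ = np.histogram(values, bins=bins, range=value_range)
    total = counts.sum()
    if total == 0:
        return 0.0
    p = counts[counts > 0] / total          # drop empty bins to avoid log(0)
    return float(-(p * np.log2(p)).sum())

def entropy_by_class(X, y, bins=16, value_range=(0.0, 1.0)):
    """Return {class label: entropy estimate} for a feature matrix X and labels y."""
    return {
        int(c): class_entropy(X[y == c].ravel(), bins, value_range)
        for c in np.unique(y)
    }

# Hypothetical toy data: class 0 has many samples concentrated near 0.5,
# class 1 has few samples spread over the whole range.
rng = np.random.default_rng(0)
X0 = rng.normal(0.5, 0.02, size=(900, 4)).clip(0.0, 1.0)
X1 = rng.uniform(0.0, 1.0, size=(100, 4))
X = np.vstack([X0, X1])
y = np.array([0] * 900 + [1] * 100)

H = entropy_by_class(X, y)
entropy_gap = max(H.values()) - min(H.values())  # spread of entropy across classes
print(H, entropy_gap)
```

In this toy setting the majority class has the lower entropy estimate, so the observation counts and the entropy estimates point in opposite directions. That is the kind of mismatch the abstract refers to when it separates information imbalance from observation imbalance.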
Pages: 95-109
Page count: 15