Sensitivity of Modern Deep Learning Neural Networks to Unbalanced Datasets in Multiclass Classification Problems

Cited by: 1
Authors
Barulina, Marina [1 ,2 ]
Okunkov, Sergey [1 ,3 ]
Ulitin, Ivan [1 ,3 ]
Sanbaev, Askhat [4 ]
Affiliations
[1] Russian Acad Sci, Inst Precis Mech & Control, 24 Ul Rabochaya, Saratov 410028, Russia
[2] Perm State Univ, Fac Mech & Math, 15 Ul Bukireva, Perm 614068, Russia
[3] Russia Fac Comp Sci & Informat Technol, Saratov Natl Res State Univ, St Astrakhanskaya 83, Saratov 410012, Russia
[4] Omega Clin, 46 Ul Komsomolskaia, Saratov 410031, Russia
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 15
Keywords
deep learning; unbalanced dataset; augmentation; multiclass classification; metrics boosting method; sota algorithm; visual transformer; ResNet; Xception; inception; MACHINE;
DOI
10.3390/app13158614
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Featured Application: The results of this work can be used in computer vision systems for medical problems, or in other applications where the training data are highly imbalanced.
Dataset imbalance is one of the critical problems in multiclass classification tasks. This is especially true when using contemporary pre-trained neural networks, where the last layers of the network are retrained. Large datasets with highly unbalanced classes are therefore poorly suited to model training, since using such a dataset leads to overfitting and, accordingly, to poor metrics on the test and validation datasets. In this paper, the sensitivity to dataset imbalance of Xception, ViT-384, ViT-224, VGG19, ResNet34, ResNet50, ResNet101, Inception_v3, DenseNet201, DenseNet161, and DeIT was studied using a highly imbalanced dataset of 20,971 images sorted into 7 classes. The best metrics were obtained when using a cropped dataset with augmentation of missing images in classes up to 15% of the initial number. In this way, the metrics can be increased by 2-6% compared with the metrics of the models on the initial unbalanced dataset. Moreover, the metrics for the classification of rare classes also improved significantly: the True Positive value can be increased by 0.3 or more. As a result, the best approach to training the considered networks on an initially unbalanced dataset was formulated.
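The crop-then-augment idea mentioned in the abstract can be illustrated with a small planning routine. This is only a sketch under stated assumptions, not the authors' procedure: the abstract does not spell out how the 15% limit is applied, so the code assumes that majority classes are cropped to a chosen target size and that each minority class may receive synthetic images numbering at most 15% of its own initial count. The function name `balance_plan`, its parameters, and the toy class counts are hypothetical.

```python
from collections import Counter


def balance_plan(labels, target, max_aug_percent=15):
    """Plan a crop-then-augment rebalancing (hypothetical helper).

    Returns {class: (keep, augment)} where
      keep    -- original images retained; classes larger than `target`
                 are cropped down to `target`
      augment -- synthetic images to generate for classes below `target`,
                 capped at max_aug_percent of the class's initial size
                 (an ASSUMED reading of the abstract's "up to 15%")
    """
    counts = Counter(labels)
    plan = {}
    for cls, n in counts.items():
        keep = min(n, target)
        missing = target - keep
        cap = n * max_aug_percent // 100  # integer arithmetic avoids float rounding
        plan[cls] = (keep, min(missing, cap))
    return plan


# Toy label list for illustration only; these are not the paper's class sizes.
labels = ["a"] * 1000 + ["b"] * 180 + ["c"] * 100
plan = balance_plan(labels, target=200)
for cls in sorted(plan):
    keep, augment = plan[cls]
    print(f"class {cls}: keep {keep}, augment {augment}")
```

In this sketch the cap is the binding constraint for very small classes, so such classes stay below the target size rather than being filled entirely with augmented copies.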
Pages: 20