Cost-Sensitive Learning based on Performance Metric for Imbalanced Data

被引:0
|
作者
Aurelio, Yuri Sousa [1 ]
de Almeida, Gustavo Matheus [1 ]
de Castro, Cristiano Leite [1 ]
Braga, Antonio Padua [1 ]
机构
[1] Univ Fed Minas Gerais, Belo Horizonte, MG, Brazil
关键词
Classification; Imbalanced problem; Cost-sensitive function; Multi-Layer perceptron; Back-propagation; Confusion matrix; CLASSIFICATION; IDENTIFICATION; ENSEMBLE;
D O I
10.1007/s11063-022-10756-2
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Performance metrics are usually evaluated only after the neural network learning process using an error cost function. This procedure can result in suboptimal model selection, particularly for imbalanced classification problems. This work proposes the direct use of these metrics as cost functions, which are often derived from the confusion matrix. Commonly used metrics are covered, namely AUC, G-mean, F1-score and AG-mean. The only implementation change for model training occurs in the backpropagation error term. The results were compared to the standard MLP using the Rprop learning algorithm, SMOTE, SMTTL, WWE and RAMOBoost. Sixteen classical benchmark datasets were used in the experiments. Based on average ranks, the proposed formulation outperformed Rprop and all sampling strategies, namely SMOTE, SMTTL and WWE, for all metrics. These results were statistically confirmed for AUC and G-mean in relation to Rprop. For F1-score and AG-mean, all algorithms were considered statistically equivalent. The proposal was also superior to RAMOBoost for G-mean given average ranks. However, it was statistically faster than RAMOBoost for all metrics. It was also faster than SMTTL and statistically equivalent to Rprop, SMOTE and WWE. More, the solutions obtained are generally non-dominated ones compared to all other techniques, for all metrics. The results showed that the direct use of performance metrics as cost functions for neural network training favors generalization capacity and also computation time in imbalanced classification problems. Its extension to other performance metrics derived directly from the confusion matrix is straightforward.
引用
收藏
页码:3097 / 3114
页数:18
相关论文
共 50 条
  • [1] Cost-Sensitive Learning based on Performance Metric for Imbalanced Data
    Yuri Sousa Aurelio
    Gustavo Matheus de Almeida
    Cristiano Leite de Castro
    Antonio Padua Braga
    [J]. Neural Processing Letters, 2022, 54 : 3097 - 3114
  • [2] Cost-sensitive learning for imbalanced data streams
    Loezer, Lucas
    Enembreck, Fabricio
    Barddal, Jean Paul
    Britto Jr, Alceu de Souza
    [J]. PROCEEDINGS OF THE 35TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING (SAC'20), 2020, : 498 - 504
  • [3] Cost-Sensitive Learning Methods for Imbalanced Data
    Nguyen Thai-Nghe
    Gantner, Zeno
    Schmidt-Thieme, Lars
    [J]. 2010 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS IJCNN 2010, 2010,
  • [4] Cost-sensitive learning for imbalanced medical data: a review
    Araf, Imane
    Idri, Ali
    Chairi, Ikram
    [J]. ARTIFICIAL INTELLIGENCE REVIEW, 2024, 57 (04)
  • [5] On the Role of Cost-Sensitive Learning in Imbalanced Data Oversampling
    Krawczyk, Bartosz
    Wozniak, Michal
    [J]. COMPUTATIONAL SCIENCE - ICCS 2019, PT III, 2019, 11538 : 180 - 191
  • [6] Cost-sensitive learning for imbalanced medical data: a review
    Imane Araf
    Ali Idri
    Ikram Chairi
    [J]. Artificial Intelligence Review, 57
  • [7] Cost-Sensitive Learning of Deep Feature Representations From Imbalanced Data
    Khan, Salman H.
    Hayat, Munawar
    Bennamoun, Mohammed
    Sohel, Ferdous A.
    Togneri, Roberto
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (08) : 3573 - 3587
  • [8] A Cost-Sensitive Learning Strategy for Feature Extraction from Imbalanced Data
    Braytee, Ali
    Liu, Wei
    Kennedy, Paul
    [J]. NEURAL INFORMATION PROCESSING, ICONIP 2016, PT III, 2016, 9949 : 78 - 86
  • [9] Cost-sensitive boosting for classification of imbalanced data
    Sun, Yamnin
    Kamel, Mohamed S.
    Wong, Andrew K. C.
    Wang, Yang
    [J]. PATTERN RECOGNITION, 2007, 40 (12) : 3358 - 3378
  • [10] Cost-sensitive sparse group online learning for imbalanced data streams
    Chen, Zhong
    Sheng, Victor
    Edwards, Andrea
    Zhang, Kun
    [J]. MACHINE LEARNING, 2024, 113 (07) : 4407 - 4444