A New Performance Measure for Class Imbalance Learning. Application to Bioinformatics Problems

Cited by: 26
Authors
Batuwita, Rukshan [1]
Palade, Vasile [1]
Affiliation
[1] Univ Oxford, Comp Lab, Oxford OX1 3QD, England
Keywords
Performance Measures; Class Imbalance Learning; Bioinformatics; Model Selection; SVMs; CLASSIFICATION;
DOI
10.1109/ICMLA.2009.126
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In class imbalance learning, the performance measure used for model selection plays a vital role. Past research has established that the most widely used performance measure, the overall accuracy of the model, can lead to sub-optimal classification models when learning from imbalanced datasets. To overcome this problem, other performance measures, such as the Geometric-mean (Gm) and F-measure (Fm), have been used for imbalanced dataset learning. Training a classifier on an imbalanced dataset (where the positive class is the minority class) usually produces sub-optimal models with a higher Specificity (SP) and a lower Sensitivity (SE). Applying class imbalance learning methods can often increase the SE at the cost of some SP. In some types of real-world imbalanced classification problems, such as gene-finding problems in Bioinformatics, it is important to improve the SE as much as possible while keeping the reduction in SP to a minimum. In this paper, we show that for this type of classification problem the existing performance measures used in class imbalance learning (Gm and Fm) can still result in sub-optimal classification models. To circumvent these problems, we introduce a new performance measure, called the Adjusted Geometric-mean (AGm). We show, both analytically and empirically on two real-world Bioinformatics datasets, that AGm can perform better than the Gm and Fm metrics.
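The measures named in the abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of SE, SP, overall accuracy, Gm and Fm. The function name `imbalance_metrics` and the confusion-matrix counts are hypothetical, chosen only for illustration, and are not taken from the paper. The definition of AGm is not given in this record, so it is not implemented here.

```python
import math


def imbalance_metrics(tp, fn, tn, fp):
    """Return standard measures from confusion-matrix counts.

    The positive class is assumed to be the minority class.
    """
    se = tp / (tp + fn) if (tp + fn) else 0.0           # Sensitivity (recall)
    sp = tn / (tn + fp) if (tn + fp) else 0.0           # Specificity
    accuracy = (tp + tn) / (tp + fn + tn + fp)          # Overall accuracy
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    gm = math.sqrt(se * sp)                             # Geometric-mean of SE and SP
    # F-measure: harmonic mean of precision and SE
    fm = 2 * precision * se / (precision + se) if (precision + se) else 0.0
    return {"SE": se, "SP": sp, "accuracy": accuracy, "Gm": gm, "Fm": fm}


# Hypothetical dataset: 50 positives, 950 negatives.
# Model A labels everything negative; model B trades some SP for SE.
model_a = imbalance_metrics(tp=0, fn=50, tn=950, fp=0)
model_b = imbalance_metrics(tp=40, fn=10, tn=855, fp=95)

for name, metrics in (("A", model_a), ("B", model_b)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

With these made-up counts, model A has the higher overall accuracy even though it never detects a positive example, while Gm and Fm both rank model B higher. This is the weakness of overall accuracy that the abstract describes.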
Pages: 545-550
Page count: 6
Related papers
50 in total
  • [31] A NEW APPROACH BASED ON CLASS IMBALANCE LEARNING FOR SMALL-BUSINESSES' CREDIT ASSESSMENT
    Hao, Xiying
    Dong, Yanwen
    [J]. ICIM2012: PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON INDUSTRIAL MANAGEMENT, 2012, : 149 - 153
  • [32] Assessing and mitigating the effects of class imbalance in machine learning with application to X-ray imaging
    Qu, Wendi
    Balki, Indranil
    Mendez, Mauro
    Valen, John
    Levman, Jacob
    Tyrrell, Pascal N.
    [J]. INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY, 2020, 15 (12) : 2041 - 2048
  • [33] Assessing and mitigating the effects of class imbalance in machine learning with application to X-ray imaging
    Wendi Qu
    Indranil Balki
    Mauro Mendez
    John Valen
    Jacob Levman
    Pascal N. Tyrrell
    [J]. International Journal of Computer Assisted Radiology and Surgery, 2020, 15 : 2041 - 2048
  • [34] A STUDY OF MACHINE LEARNING ALGORITHMS TO MEASURE THE FEATURE IMPORTANCE IN CLASS-IMBALANCE DATA OF FOOD INSECURITY CASES IN INDONESIA
    Dharmawan, H.
    Sartono, B.
    Kurnia, A.
    Hadi, A. F.
    Ramadhani, E.
    [J]. COMMUNICATIONS IN MATHEMATICAL BIOLOGY AND NEUROSCIENCE, 2022,
  • [35] Combination of Traditional and Deep Learning based Architectures to Overcome Class Imbalance and its Application to Malware Classification
    Messay-Kebede, Temesguen
    Narayanan, Barath Narayanan
    Djaneye-Boundjou, Ouboti
    [J]. NAECON 2018 - IEEE NATIONAL AEROSPACE AND ELECTRONICS CONFERENCE, 2018, : 73 - 77
  • [36] Generalizable Feature Learning in the Presence of Data Bias and Domain Class Imbalance with Application to Skin Lesion Classification
    Yoon, Chris
    Hamarneh, Ghassan
    Garbi, Rafeef
    [J]. MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2019, PT IV, 2019, 11767 : 365 - 373
  • [37] PBC4cip: A new contrast pattern-based classifier for class imbalance problems
    Loyola-Gonzalez, Octavio
    Angel Medina-Perez, Miguel
    Martinez-Trinidad, Jose Fco.
    Ariel Carrasco-Ochoa, Jesus
    Monroy, Raul
    Garcia-Borroto, Milton
    [J]. KNOWLEDGE-BASED SYSTEMS, 2017, 115 : 100 - 109
  • [38] High-Performance Machine Learning for Large-Scale Data Classification considering Class Imbalance
    Liu, Yang
    Li, Xiang
    Chen, Xianbang
    Wang, Xi
    Li, Huaqiang
    [J]. SCIENTIFIC PROGRAMMING, 2020, 2020
  • [39] Correcting for the effects of class imbalance improves the performance of machine-learning based species distribution models
    Benkendorf, Donald J.
    Schwartz, Samuel D.
    Cutler, D. Richard
    Hawkins, Charles P.
    [J]. ECOLOGICAL MODELLING, 2023, 483
  • [40] A New Performance Evaluation Method for Two-Class Imbalanced Problems
    Garcia, Vicente
    Mollineda, Ramon A.
    Sanchez, J. Salvador
    [J]. STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, 2008, 5342 : 917 - 925