Learning classifiers from imbalanced data based on biased minimax probability machine

Authors
Huang, KZ [1]
Yang, HQ [1]
King, I [1]
Lyu, MR [1]
Affiliation
[1] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Shatin, Hong Kong, Peoples R China
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We consider binary classification on imbalanced data, in which nearly all instances are labelled as one class while far fewer are labelled as the other, usually the more important class. Traditional machine learning methods that seek accurate performance over the full range of instances are unsuitable for this problem, since they tend to assign all the data to the majority class, which is usually the less important one. Some current methods use intermediate factors, e.g., the distribution of the training set, the decision thresholds, or the cost matrices, to influence the bias of the classification. However, it remains uncertain whether these methods can improve performance in a systematic way. In this paper, we propose a novel model named the Biased Minimax Probability Machine. Unlike previous methods, this model directly controls the worst-case real accuracy of classification on future data to build biased classifiers. Hence, it provides a rigorous treatment of imbalanced data. Experimental results comparing the novel model with three competitive methods, i.e., the Naive Bayesian classifier, the k-Nearest Neighbor method, and the decision tree method C4.5, demonstrate the superiority of our model.
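To make "worst-case accuracy" concrete, here is a minimal sketch. It is an illustration only, not the authors' algorithm. The abstract does not state the formulation, so the sketch assumes the standard moment-based bound from the minimax probability machine literature: for a class known only through its mean and covariance, the guaranteed probability of falling on the correct side of a hyperplane a·x = b is d²/(1 + d²), where d is the margin of the projected mean divided by the projected standard deviation. The function name `worst_case_accuracy` and the toy means and covariances are hypothetical.

```python
def worst_case_accuracy(a, b, mean, cov, side):
    """Lower bound on the probability that a point lands on the correct
    side of the hyperplane a.x = b, taken over all distributions with
    the given mean and covariance (assumed moment-based bound).

    side = +1 means the class should satisfy a.x >= b,
    side = -1 means the class should satisfy a.x <= b.
    """
    n = len(a)
    # Mean and variance of the class after projection onto direction a.
    proj_mean = sum(a[i] * mean[i] for i in range(n))
    proj_var = sum(a[i] * cov[i][j] * a[j] for i in range(n) for j in range(n))

    margin = side * (proj_mean - b)
    if margin <= 0:
        # The class mean is on the wrong side: no accuracy can be guaranteed.
        return 0.0
    if proj_var == 0:
        # No spread along a: every point sits at the projected mean.
        return 1.0
    d2 = margin ** 2 / proj_var
    return d2 / (1.0 + d2)


# Hypothetical toy classes in 2-D, both with identity covariance.
identity = [[1.0, 0.0], [0.0, 1.0]]
a, b = [1.0, 0.0], 0.0

# The "important" class sits farther from the boundary than the other one.
alpha = worst_case_accuracy(a, b, [2.0, 0.0], identity, side=+1)
beta = worst_case_accuracy(a, b, [-1.0, 0.0], identity, side=-1)
print(alpha, beta)
```

Under this assumption, a biased classifier would pick a and b so that the bound for the important class is as large as possible while the bound for the other class stays above an acceptable level. Moving b toward the less important class raises the first bound and lowers the second.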
Pages: 558-563
Number of pages: 6