A review of boosting methods for imbalanced data classification

Cited by: 28
Authors
Li, Qiujie [1, 2]
Mao, Yaobin [2 ]
Affiliations
[1] Nanjing Forestry Univ, Coll Mech & Elect Engn, Nanjing 210037, Jiangsu, Peoples R China
[2] Nanjing Univ Sci & Technol, Sch Automat, Nanjing 210094, Jiangsu, Peoples R China
Keywords
Imbalanced data classification; Boosting; Cost-sensitive learning; Data sampling; Margin
DOI
10.1007/s10044-014-0392-8
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, the problem of imbalanced data classification has drawn significant interest from academia, industry and government funding agencies. The fundamental issue is that imbalanced data significantly degrades the performance of most standard learning algorithms, which assume or expect a balanced class distribution or equal misclassification costs. Boosting is a meta-technique that is applicable to most learning algorithms. This paper reviews boosting methods for imbalanced data classification, denoted IDBoosting (imbalanced-data boosting), into which conventional learning algorithms can be integrated without further modification. The main focus is on the intrinsic mechanisms rather than on implementation details. Existing methods are categorized, and each category is presented in detail in terms of design criteria, typical algorithms and performance analysis. The essence of two IDBoosting methods is identified and supported by experimental evidence, and useful reference points for future research are also given.
Pages: 679-693
Number of pages: 15
Related papers
50 in total
  • [1] A review of boosting methods for imbalanced data classification
    Qiujie Li
    Yaobin Mao
    [J]. Pattern Analysis and Applications, 2014, 17 : 679 - 693
  • [2] Boosting methods for multi-class imbalanced data classification: an experimental review
    Tanha, Jafar
    Abdi, Yousef
    Samadi, Negin
    Razzaghi, Nazila
    Asadpour, Mohammad
    [J]. JOURNAL OF BIG DATA, 2020, 7 (01)
  • [3] Boosting methods for multi-class imbalanced data classification: an experimental review
    Jafar Tanha
    Yousef Abdi
    Negin Samadi
    Nazila Razzaghi
    Mohammad Asadpour
    [J]. Journal of Big Data, 7
  • [4] Review of imbalanced data classification methods
    Li, Yan-Xia
    Chai, Yi
    Hu, You-Qiang
    Yin, Hong-Peng
    [J]. Kongzhi yu Juece/Control and Decision, 2019, 34 (04): : 673 - 688
  • [5] An Imbalanced Data Classification Algorithm Based on Boosting
    Li Qiu-Jie
    Mao Yao-Bin
    Wang Zhi-Quan
    [J]. 2011 30TH CHINESE CONTROL CONFERENCE (CCC), 2011, : 3053 - 3057
  • [6] A New Improved Boosting for Imbalanced Data Classification
    Zhang, Zongtang
    Qiu, JiaXing
    Dai, Weiguo
    [J]. 2019 THE 5TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING, CONTROL AND ROBOTICS (EECR 2019), 2019, 533
  • [7] MEBoost: Mixing Estimators with Boosting for Imbalanced Data Classification
    Rayhan, Farshid
    Ahmed, Sajid
    Mahbub, Asif
    Jani, Md. Rafsan
    Shatabda, Swakkhar
    Farid, Dewan Md.
    Rahman, Chowdhury Mofizur
    [J]. 2017 11TH INTERNATIONAL CONFERENCE ON SOFTWARE, KNOWLEDGE, INFORMATION MANAGEMENT AND APPLICATIONS (SKIMA), 2017,
  • [8] Oversampling boosting for classification of imbalanced software defect data
    Li, Guangling
    Wang, Shihai
    [J]. PROCEEDINGS OF THE 35TH CHINESE CONTROL CONFERENCE 2016, 2016, : 4149 - 4154
  • [9] CLASSIFICATION OF IMBALANCED DATA: A REVIEW
    Sun, Yanmin
    Wong, Andrew K. C.
    Kamel, Mohamed S.
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2009, 23 (04) : 687 - 719
  • [10] Cost-sensitive boosting for classification of imbalanced data
    Sun, Yanmin
    Kamel, Mohamed S.
    Wong, Andrew K. C.
    Wang, Yang
    [J]. PATTERN RECOGNITION, 2007, 40 (12) : 3358 - 3378