Making class bias useful: A strategy of learning from imbalanced data

Cited by: 0
Authors
Gu, Jie [1]
Zhou, Yuanbing [2 ]
Zu, Xianqiang [2 ]
Affiliations
[1] Natl Tsing Hua Univ, Software Sch, Hsinchu, Taiwan
[2] State Power Econom Res Inst, Nanjing, Peoples R China
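Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract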
The performance of many learning methods is often affected by the class imbalance problem, where the training data is dominated by instances belonging to one class. In this paper, we propose a novel method that combines random forest based techniques and sampling methods to learn effectively from imbalanced data. Our method consists of two phases: data cleaning and classification based on random forest. First, the training data is cleaned by eliminating dangerous negative instances. The data cleaning process is supervised by a negative-biased random forest, in which negative instances make up the major proportion of the training data for each tree in the forest. Second, we develop a variant of random forest in which each tree is biased towards the positive class to classify the data set, with the prediction made by majority vote. In experiments, we compared our method with other existing methods on real data sets, and the results demonstrate a significant performance improvement of our method in terms of the area under the ROC curve (AUC).
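The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define "dangerous" negatives, so the sketch assumes a negative is dangerous when a negative-biased forest still votes it positive. One-split stumps stand in for full trees, and the names `biased_bootstrap`, `fit_stump`, `fit_forest`, `positive_vote`, `clean_then_classify`, the sampling fractions 0.2 and 0.6, and the toy data are all hypothetical.

```python
import random
from collections import Counter


def biased_bootstrap(X, y, pos_frac, rng):
    """Bootstrap sample in which roughly pos_frac of the rows are positive (label 1).

    Assumes both classes are present and len(y) >= 2.
    """
    pos = [i for i, lab in enumerate(y) if lab == 1]
    neg = [i for i, lab in enumerate(y) if lab == 0]
    # Clamp so that every sample contains at least one row of each class.
    n_pos = min(len(y) - 1, max(1, round(pos_frac * len(y))))
    idx = [rng.choice(pos) for _ in range(n_pos)]
    idx += [rng.choice(neg) for _ in range(len(y) - n_pos)]
    return [X[i] for i in idx], [y[i] for i in idx]


def fit_stump(X, y, rng):
    """One-split tree on a random feature: a deliberately tiny stand-in for a full tree."""
    f = rng.randrange(len(X[0]))
    majority = Counter(y).most_common(1)[0][0]
    # Fallback when no split is usable: always predict the sample majority.
    best = (len(y) + 1, float("inf"), majority, majority)
    for t in sorted({row[f] for row in X}):
        left = [lab for row, lab in zip(X, y) if row[f] <= t]
        right = [lab for row, lab in zip(X, y) if row[f] > t]
        if not right:
            continue
        l_lab = Counter(left).most_common(1)[0][0]
        r_lab = Counter(right).most_common(1)[0][0]
        errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
        if errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    return (f,) + best[1:]  # (feature, threshold, left label, right label)


def fit_forest(X, y, pos_frac, n_trees, rng):
    """Each tree is trained on its own class-biased bootstrap sample."""
    return [fit_stump(*biased_bootstrap(X, y, pos_frac, rng), rng) for _ in range(n_trees)]


def positive_vote(forest, row):
    """Fraction of trees that vote for the positive class."""
    votes = [l_lab if row[f] <= t else r_lab for f, t, l_lab, r_lab in forest]
    return sum(votes) / len(votes)


def clean_then_classify(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    # Phase 1 (assumed rule): drop negatives that a negative-biased forest votes positive.
    neg_forest = fit_forest(X, y, pos_frac=0.2, n_trees=n_trees, rng=rng)
    keep = [i for i, lab in enumerate(y)
            if lab == 1 or positive_vote(neg_forest, X[i]) <= 0.5]
    Xc, yc = [X[i] for i in keep], [y[i] for i in keep]
    # Phase 2: positive-biased forest on the cleaned data, predicting by majority vote.
    pos_forest = fit_forest(Xc, yc, pos_frac=0.6, n_trees=n_trees, rng=rng)
    return (Xc, yc), lambda row: int(positive_vote(pos_forest, row) > 0.5)


# Toy imbalanced data: 20 negatives near the origin, 4 positives far away.
X = [(0.1 * i, 0.1 * i) for i in range(20)] + [(10.0 + i, 10.0 + i) for i in range(4)]
y = [0] * 20 + [1] * 4
(cleaned_X, cleaned_y), predict = clean_then_classify(X, y)
print(len(cleaned_y), predict((12.0, 12.0)), predict((0.0, 0.0)))
```

Only negatives are ever removed in the cleaning phase, so the minority class is kept intact. A fuller version would replace the stumps with real decision trees and evaluate with AUC, as the abstract reports.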
Pages: 287 / +
Page count: 3