Making class bias useful: A strategy of learning from imbalanced data

Cited by: 0
Authors
Gu, Jie [1]
Zhou, Yuanbing [2]
Zu, Xianqiang [2]
Affiliations
[1] Natl Tsing Hua Univ, Software Sch, Hsinchu, Taiwan
[2] State Power Econom Res Inst, Nanjing, Peoples R China
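Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract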
The performance of many learning methods is often degraded by the class imbalance problem, in which the training data is dominated by instances belonging to one class. In this paper, we propose a novel method that combines random-forest-based techniques with sampling methods to learn effectively from imbalanced data. Our method consists of two phases: data cleaning and random-forest-based classification. First, the training data is cleaned by eliminating dangerous negative instances. The cleaning process is supervised by a negative-biased random forest, in which negative instances make up the major proportion of the training data for each tree in the forest. Second, we develop a variant of random forest in which each tree is biased towards the positive class to classify the data set, with predictions made by majority vote. In the experiments, we compared our method with existing methods on real data sets, and the results demonstrate a significant performance improvement for our method in terms of the area under the ROC curve (AUC).
Pages: 287 / +
Page count: 3
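The two-phase procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: a negative instance is treated as "dangerous" when even the negative-biased forest votes it positive; the class proportions per tree (`pos_fraction` of 0.3 and 0.7), the tree count, and the sample size are arbitrary; and one-split decision stumps stand in for full trees to keep the example short. All function names are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels, rng):
    """Fit a one-split tree on one randomly chosen feature.

    Assumption: a stump stands in for a full decision tree.
    Returns (feature, threshold, left_label, right_label).
    """
    f = rng.randrange(len(rows[0]))
    majority = Counter(labels).most_common(1)[0][0]
    # Fallback if no threshold splits the sample: predict the majority label.
    best = (len(labels) + 1, f, float("inf"), majority, majority)
    for t in sorted({r[f] for r in rows}):
        left = [lab for r, lab in zip(rows, labels) if r[f] <= t]
        right = [lab for r, lab in zip(rows, labels) if r[f] > t]
        if not right:
            continue  # this threshold does not split the sample
        l_lab = Counter(left).most_common(1)[0][0]
        r_lab = Counter(right).most_common(1)[0][0]
        errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
        if errors < best[0]:
            best = (errors, f, t, l_lab, r_lab)
    return best[1:]


def predict_stump(stump, x):
    f, t, l_lab, r_lab = stump
    return l_lab if x[f] <= t else r_lab


def biased_forest(X, y, pos_fraction, rng, n_trees=25, sample_size=20):
    """Train each tree on a bootstrap sample with a fixed share of positives.

    Assumes both classes (labels 0 and 1) are present in y.
    """
    pos = [i for i, lab in enumerate(y) if lab == 1]
    neg = [i for i, lab in enumerate(y) if lab == 0]
    n_pos = round(sample_size * pos_fraction)
    forest = []
    for _ in range(n_trees):
        idx = [rng.choice(pos) for _ in range(n_pos)]
        idx += [rng.choice(neg) for _ in range(sample_size - n_pos)]
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest


def vote(forest, x):
    """Majority vote: positive only if more than half of the trees say so."""
    votes = sum(predict_stump(s, x) for s in forest)
    return 1 if 2 * votes > len(forest) else 0


def clean_negatives(X, y, rng):
    """Phase 1 (assumed rule): drop negatives that a negative-biased forest votes positive."""
    forest = biased_forest(X, y, pos_fraction=0.3, rng=rng)
    keep = [i for i, lab in enumerate(y) if lab == 1 or vote(forest, X[i]) == 0]
    if all(y[i] == 1 for i in keep):
        return X, y  # guard: never drop every negative
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: 20 negatives at low x, 5 positives at high x,
# and one negative placed inside the positive region.
rng = random.Random(0)
X = [[float(v), float(100 - v)] for v in list(range(20)) + list(range(30, 35))] + [[32.5, 67.5]]
y = [0] * 20 + [1] * 5 + [0]

X_clean, y_clean = clean_negatives(X, y, rng)
# Phase 2: a positive-biased forest trained on the cleaned data.
forest = biased_forest(X_clean, y_clean, pos_fraction=0.7, rng=rng)
preds = [vote(forest, x) for x in [[2.0, 98.0], [33.0, 67.0]]]
print(len(X) - len(X_clean), "negatives removed; predictions:", preds)
```

Biasing the per-tree sample rather than reweighting a loss keeps each tree an ordinary classifier; only the sampling step changes between the cleaning forest and the classification forest.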