Transfer synthetic over-sampling for class-imbalance learning with limited minority class data

Cited by: 0
Authors
Xu-Ying Liu
Sheng-Tao Wang
Min-Ling Zhang
Affiliations
[1] Southeast University, School of Computer Science and Engineering
[2] Ministry of Education, Key Laboratory of Computer Network and Information Integration (Southeast University)
[3] Collaborative Innovation Center for Wireless Communications Technology
Source
Frontiers of Computer Science, 2019, 13(5): 996-1009
Keywords
machine learning; data mining; class imbalance; over-sampling; boosting; transfer learning
DOI
Not available
Abstract
The problem of limited minority-class data arises in many class-imbalanced applications but has received little attention. Synthetic over-sampling methods, though popular for class-imbalance learning, can introduce considerable noise when the minority class has limited data, because the synthetic samples are not i.i.d. samples of the minority class. Most sophisticated synthetic sampling methods tackle this problem by denoising or by generating samples more consistent with the ground-truth data distribution, but their assumptions about the true noise or the ground-truth distribution may not hold. To adapt synthetic sampling to limited minority-class data, the proposed Traso framework treats synthetic minority-class samples as an additional data source and exploits transfer learning to transfer knowledge from them to the minority class. As an implementation, the TrasoBoost method first generates synthetic samples to balance the class sizes. Then, in each boosting iteration, the weights of misclassified synthetic samples decrease while the weights of misclassified original samples increase; the weights of correctly classified samples remain unchanged. Misclassified synthetic samples are potential noise and therefore have less influence in subsequent iterations. In addition, the weights of minority-class instances change more than those of majority-class instances, making them more influential, and only the original data are used to estimate the error rate, keeping it immune from noise. Finally, since the synthetic samples are highly related to the minority class, all of the weak learners are aggregated for prediction. Experimental results show that TrasoBoost outperforms many popular class-imbalance learning methods.
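For readers who want a concrete picture of the boosting loop described above, the Python fragment below mirrors the abstract's description. It is only an illustrative sketch: the choice of SMOTE as the synthetic generator, the exponential weight updates, and the minority-class scaling factor `gamma` are assumptions made for this example; the exact TrasoBoost update formulas are given in the paper itself.

```python
# Illustrative sketch of a TrasoBoost-style loop (not the paper's exact method).
import numpy as np
from imblearn.over_sampling import SMOTE          # assumed synthetic over-sampler
from sklearn.tree import DecisionTreeClassifier


def trasoboost_sketch(X, y, n_rounds=20, gamma=2.0):
    """X: (n, d) array; y: binary integer labels in {0, 1}."""
    minority_label = np.argmin(np.bincount(y))

    # 1) Generate synthetic minority samples to balance class sizes.
    #    imbalanced-learn appends the synthetic samples after the original ones.
    X_all, y_all = SMOTE().fit_resample(X, y)
    n_orig = len(y)
    is_orig = np.arange(len(y_all)) < n_orig       # True for original data

    w = np.full(len(y_all), 1.0 / len(y_all))      # instance weights
    learners, alphas = [], []

    for _ in range(n_rounds):
        clf = DecisionTreeClassifier(max_depth=3)
        clf.fit(X_all, y_all, sample_weight=w)
        wrong = clf.predict(X_all) != y_all

        # 2) Error rate is estimated on the ORIGINAL data only,
        #    so it stays immune from noise in the synthetic samples.
        err = np.sum(w[is_orig] * wrong[is_orig]) / np.sum(w[is_orig])
        err = np.clip(err, 1e-10, 0.499)
        alpha = 0.5 * np.log((1.0 - err) / err)

        # 3) Weight updates: misclassified synthetic samples are treated as
        #    potential noise (weights shrink); misclassified original samples
        #    gain weight; correctly classified samples keep their weights.
        #    Minority-class instances change more (assumed factor gamma > 1).
        scale = np.where(y_all == minority_label, gamma, 1.0)
        w[~is_orig & wrong] *= np.exp(-alpha * scale[~is_orig & wrong])
        w[is_orig & wrong] *= np.exp(alpha * scale[is_orig & wrong])
        w /= w.sum()

        learners.append(clf)
        alphas.append(alpha)

    return learners, alphas


def predict(learners, alphas, X_new):
    # 4) All weak learners are aggregated for the final prediction.
    votes = sum(a * (2 * clf.predict(X_new) - 1) for clf, a in zip(learners, alphas))
    return (votes > 0).astype(int)
```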
Pages: 996-1009
Page count: 13
Related Papers
50 records in total
  • [1] Transfer synthetic over-sampling for class-imbalance learning with limited minority class data
    Liu, Xu-Ying
    Wang, Sheng-Tao
    Zhang, Min-Ling
    [J]. FRONTIERS OF COMPUTER SCIENCE, 2019, 13 (05) : 996 - 1009
  • [2] On the Use of Surrounding Neighbors for Synthetic Over-Sampling of the Minority Class
    Garcia, V.
    Sanchez, J. S.
    Mollineda, R. A.
    [J]. SMO 08: PROCEEDINGS OF THE 8TH WSEAS INTERNATIONAL CONFERENCE ON SIMULATION, MODELLING AND OPTIMIZATION, 2008, : 389 - +
  • [3] RCSMOTE: Range-Controlled synthetic minority over-sampling technique for handling the class imbalance problem
    Soltanzadeh, Paria
    Hashemzadeh, Mahdi
    [J]. INFORMATION SCIENCES, 2021, 542 : 92 - 111
  • [4] Adaptive Sampling with Optimal Cost for Class-Imbalance Learning
    Peng, Yuxin
    [J]. PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015, : 2921 - 2927
  • [5] DOS-GAN: A Distributed Over-Sampling Method Based on Generative Adversarial Networks for Distributed Class-Imbalance Learning
    Guan, Hongtao
    Ma, Xingkong
    Shen, Siqi
    [J]. ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2020, PT III, 2020, 12454 : 609 - 622
  • [6] Exploratory under-sampling for class-imbalance learning
    Liu, Xu-Ying
    Wu, Jianxin
    Zhou, Zhi-Hua
    [J]. ICDM 2006: SIXTH INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2006, : 965 - 969
  • [7] Manifold Distance-Based Over-Sampling Technique for Class Imbalance Learning
    Yang, Lingkai
    Guo, Yinan
    Cheng, Jian
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 10071 - 10072
  • [8] Noise Reduction A Priori Synthetic Over-Sampling for class imbalanced data sets
    Rivera, William A.
    [J]. INFORMATION SCIENCES, 2017, 408 : 146 - 161
  • [9] Trainable Undersampling for Class-Imbalance Learning
    Peng, Minlong
    Zhang, Qi
    Xing, Xiaoyu
    Gui, Tao
    Huang, Xuanjing
    Jiang, Yu-Gang
    Ding, Keyu
    Chen, Zhigang
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 4707 - 4714
  • [10] Exploratory Undersampling for Class-Imbalance Learning
    Liu, Xu-Ying
    Wu, Jianxin
    Zhou, Zhi-Hua
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2009, 39 (02): : 539 - 550