Transfer learning for class imbalance problems with inadequate data

Cited by: 75
Authors
Al-Stouhi, Samir [1 ]
Reddy, Chandan K. [2 ]
Affiliations
[1] Honda Automobile Technol Res, Southfield, MI USA
[2] Wayne State Univ, Dept Comp Sci, Detroit, MI 48202 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA);
Keywords
Rare class; Transfer learning; Class imbalance; AdaBoost; Weighted majority algorithm; HealthCare informatics; Text mining;
DOI
10.1007/s10115-015-0870-3
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A fundamental problem in data mining is to effectively build robust classifiers in the presence of skewed data distributions. Class imbalance classifiers are trained specifically for skewed distribution datasets. Existing methods assume an ample supply of training examples as a fundamental prerequisite for constructing an effective classifier. However, when sufficient data are not readily available, the development of a representative classification algorithm becomes even more difficult due to the unequal distribution between classes. We provide a unified framework that can take advantage of auxiliary data through a transfer learning mechanism and simultaneously build a robust classifier to tackle this imbalance issue when few training samples are available in a particular target domain of interest. Transfer learning methods use auxiliary data to augment learning when training examples are not sufficient, and in this paper we develop a method that is optimized to simultaneously augment the training data and induce balance into skewed datasets. We propose a novel boosting-based instance transfer classifier with a label-dependent update mechanism that simultaneously compensates for class imbalance and incorporates samples from an auxiliary domain to improve classification. We provide theoretical and empirical validation of our method and apply it to healthcare and text classification applications.
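
The abstract does not give the update rule itself, so the following is only a minimal sketch of what one round of a boosting-based instance-transfer weight update with a label-dependent term could look like. It assumes a TrAdaBoost-style scheme (misclassified auxiliary instances are decayed, misclassified target instances are boosted) and adds a per-label multiplier on the auxiliary weights. The function name `transfer_boost_round`, the `label_factor` argument, and the clipping constants are hypothetical and are not taken from the paper.

```python
import numpy as np


def transfer_boost_round(w_src, w_tgt, miss_src, miss_tgt, y_src,
                         n_rounds, label_factor):
    """One weight update of a TrAdaBoost-style instance-transfer booster.

    w_src, w_tgt       : current weights of auxiliary (source) and target instances
    miss_src, miss_tgt : 1 where the weak learner misclassified, else 0
    y_src              : labels of the source instances
    n_rounds           : total number of boosting rounds
    label_factor       : hypothetical per-label multiplier for source weights,
                         e.g. {0: 1.0, 1: 2.0} to favour a rare class 1
    """
    w_src = np.asarray(w_src, dtype=float)
    w_tgt = np.asarray(w_tgt, dtype=float)
    miss_src = np.asarray(miss_src, dtype=float)
    miss_tgt = np.asarray(miss_tgt, dtype=float)

    # Weighted error is measured on the target domain only.
    eps = np.sum(w_tgt * miss_tgt) / np.sum(w_tgt)
    # Assumption: keep eps strictly inside (0, 0.5) so beta_tgt stays in (0, 1).
    eps = np.clip(eps, 1e-10, 0.499)

    beta_tgt = eps / (1.0 - eps)
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(len(w_src)) / n_rounds))

    # Label-dependent term (illustrative): scale each source weight by its label's factor.
    factor = np.array([label_factor[label] for label in y_src], dtype=float)

    # Misclassified source instances decay; misclassified target instances grow.
    new_src = w_src * factor * beta_src ** miss_src
    new_tgt = w_tgt * beta_tgt ** (-miss_tgt)

    # Renormalise so all weights sum to one.
    total = new_src.sum() + new_tgt.sum()
    return new_src / total, new_tgt / total
```

With `label_factor` set to 1.0 for every label, this reduces to a plain TrAdaBoost-style update; raising the factor for the rare class keeps more auxiliary weight on minority-label instances, which is one simple way to combine transfer with imbalance compensation.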
Pages: 201 - 228
Page count: 28
Related papers
50 in total
  • [1] Transfer learning for class imbalance problems with inadequate data
    Samir Al-Stouhi
    Chandan K. Reddy
    [J]. Knowledge and Information Systems, 2016, 48 : 201 - 228
  • [2] Strategies for learning in class imbalance problems
    Barandela, R
    Sánchez, JS
    García, V
    Rangel, E
    [J]. PATTERN RECOGNITION, 2003, 36 (03) : 849 - 851
  • [3] Ensemble Methods with Statistics and Machine Learning on the Class Imbalance Problems of EEG data
    Mishra, Sneha
    Jaiswal, Umesh Chandra
    [J]. JOURNAL OF ELECTRICAL SYSTEMS, 2024, 20 (05) : 453 - 462
  • [4] Unsupervised Ensemble Learning for Class Imbalance Problems
    Liu, Zihan
    Wu, Dongrui
    [J]. 2018 CHINESE AUTOMATION CONGRESS (CAC), 2018, : 3593 - 3600
  • [5] Transfer Learning on Decision Tree with Class Imbalance
    Minvielle, Ludovic
    Atiq, Mounir
    Peignier, Sergio
    Mougeot, Mathilde
    [J]. 2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 1003 - 1010
  • [6] Transfer synthetic over-sampling for class-imbalance learning with limited minority class data
    Liu, Xu-Ying
    Wang, Sheng-Tao
    Zhang, Min-Ling
    [J]. FRONTIERS OF COMPUTER SCIENCE, 2019, 13 (05) : 996 - 1009
  • [7] Transfer synthetic over-sampling for class-imbalance learning with limited minority class data
    Xu-Ying Liu
    Sheng-Tao Wang
    Min-Ling Zhang
    [J]. Frontiers of Computer Science, 2019, 13 : 996 - 1009
  • [8] Learning from data streams and class imbalance
    Wang, Shuo
    Minku, Leandro L.
    Chawla, Nitesh
    Yao, Xin
    [J]. CONNECTION SCIENCE, 2019, 31 (02) : 103 - 104
  • [9] Comparing Transfer Learning and Traditional Learning Under Domain Class Imbalance
    Weiss, Karl R.
    Khoshgoftaar, Taghi M.
    [J]. 2017 16TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2017, : 337 - 343
  • [10] Generating Data to Alleviate Data Imbalance Problems in Machine Learning
    Niimi, Ayahiko
    Sakamoto, Kosuke
    [J]. FUZZY SYSTEMS AND DATA MINING V (FSDM 2019), 2019, 320 : 534 - 541