A Cost-sensitive Active Learning for Imbalance Data with Uncertainty and Diversity Combination

Cited by: 5
Authors
Dong, Huailong [1]
Zhu, Bowen [1]
Zhang, Jing [1]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, 200 Xiaolingwei St, Nanjing 210094, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
Active Learning; Imbalanced Learning; Cost-Sensitive Learning;
DOI
10.1145/3383972.3384002
Chinese Library Classification
TP301 [Theory and Methods];
Subject classification code
081202;
Abstract
The class distributions of real-world classification datasets are usually imbalanced, because many applications, such as network intrusion detection, tumor classification, and financial risk identification, are imbalanced by nature: positive examples are rare. When such data are labeled to create training sets for supervised learning, too many examples belonging to the majority class are labeled. This dramatically increases the labeling cost and is usually unnecessary, because balanced datasets are more suitable for inducing good learners. To deal with this problem, this paper proposes a novel cost-sensitive active learning algorithm that combines uncertainty and diversity measures to select training examples from an unlabeled sample pool. We use the proportions of the majority and minority classes among all examples in the training dataset as the weights of the majority class and the minority class, respectively. With these class weights, minority examples receive more emphasis when learning models are built. Experimental results show that the proposed method can significantly reduce the labeling cost while improving the performance of the learning models.
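The selection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, since the abstract gives no formulas. The function names (`class_weights`, `uncertainty`, `select_batch`), the mixing parameter `alpha`, the binary margin-style uncertainty, the Euclidean distance used for diversity, and the choice to weight each class by the proportion of the other classes (so that the rarer class gets the larger weight) are all assumptions made for illustration.

```python
import numpy as np

def class_weights(y):
    """Derive class weights from class proportions in the labeled set.

    Assumption: each class is weighted by the combined proportion of the
    other classes, so the rarer class receives the larger weight.
    """
    classes, counts = np.unique(np.asarray(y), return_counts=True)
    props = counts / counts.sum()
    return {c: 1.0 - p for c, p in zip(classes.tolist(), props.tolist())}

def uncertainty(proba):
    """Binary uncertainty: 1.0 when P(positive) = 0.5, 0.0 when it is 0 or 1."""
    proba = np.asarray(proba, dtype=float)
    return 1.0 - 2.0 * np.abs(proba - 0.5)

def select_batch(X_pool, proba, k, alpha=0.5):
    """Greedily pick k pool indices by mixing uncertainty and diversity.

    Diversity of a candidate is its distance to the nearest example already
    chosen, scaled to [0, 1]; alpha is a hypothetical mixing weight.
    """
    X_pool = np.asarray(X_pool, dtype=float)
    unc = uncertainty(proba)
    chosen = []
    min_dist = np.full(len(X_pool), np.inf)
    for _ in range(min(k, len(X_pool))):
        if chosen:
            div = min_dist / (min_dist.max() + 1e-12)
        else:
            div = np.ones(len(X_pool))  # nothing chosen yet: all equally diverse
        score = alpha * unc + (1.0 - alpha) * div
        if chosen:
            score[chosen] = -np.inf  # never pick the same example twice
        i = int(np.argmax(score))
        chosen.append(i)
        dist_to_i = np.linalg.norm(X_pool - X_pool[i], axis=1)
        min_dist = np.minimum(min_dist, dist_to_i)
    return chosen

# Usage: weights for a 3:1 imbalanced label set, then a batch of two queries.
weights = class_weights([0, 0, 0, 1])
batch = select_batch([[0, 0], [0, 0.1], [5, 5], [10, 10]],
                     proba=[0.5, 0.5, 0.9, 0.1], k=2)
```

In a full loop, the selected examples would be labeled, added to the training set, and the classifier retrained with the updated weights, for example through a `class_weight`-style argument where the chosen learner supports one.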
Pages: 218-224
Page count: 7