Cost-sensitive classification with inadequate labeled data

Cited by: 32
Authors
Wang, Tao [2]
Qin, Zhenxing [2]
Zhang, Shichao [1]
Zhang, Chengqi [2]
Affiliations
[1] Guangxi Normal Univ, Coll Comp Sci & Informat Technol, Guilin, Peoples R China
[2] Univ Technol Sydney, Fac Engn & Informat Technol, Broadway, NSW 2007, Australia
Funding
Australian Research Council
Keywords
Cost-sensitive learning; Classification; Semi-supervised learning; Expectation maximization
DOI
10.1016/j.is.2011.10.009
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Learning cost-sensitive models from datasets that contain few labeled examples and plentiful unlabeled examples is a practical and challenging problem, because labeled data are sometimes very difficult, time-consuming, and/or expensive to obtain. To address this problem, we propose two classification strategies, both based on Expectation Maximization (EM), for learning cost-sensitive classifiers from training datasets that contain both labeled and unlabeled data. The first method, Direct-EM, uses EM to build a semi-supervised classifier and then directly computes the optimal class label for each test example from the class probabilities produced by the learned model. The second method, CS-EM, modifies EM by incorporating misclassification costs into the probability estimation process. We conducted extensive experiments to evaluate the efficiency of the two methods. The results show that, when only a small number of labeled training examples are used, CS-EM outperforms the competing methods on the majority of the selected UCI data sets across different cost ratios, especially when the cost ratio is high. (C) 2011 Elsevier Ltd. All rights reserved.
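The final step of Direct-EM, choosing a label from class probabilities, can be illustrated with a minimal sketch. The abstract does not give the decision rule or the cost values, so everything below is an assumption: the rule shown is the standard minimum-expected-cost decision, the function name `min_expected_cost_labels` is hypothetical, the probabilities stand in for the output of an EM-trained classifier, and the cost matrix (10:1 cost ratio) is made up for illustration.

```python
import numpy as np

def min_expected_cost_labels(proba, cost):
    """Pick, for each example, the label with the lowest expected cost.

    proba[n, i] : estimated probability that example n belongs to class i
                  (assumed here to come from a semi-supervised EM model).
    cost[i, j]  : assumed cost of predicting class j when the true class is i.
    """
    proba = np.asarray(proba, dtype=float)
    cost = np.asarray(cost, dtype=float)
    # expected[n, j] = sum_i proba[n, i] * cost[i, j]
    expected = proba @ cost
    return expected.argmin(axis=1)

# Illustrative class probabilities for three test examples (not real data).
proba = np.array([[0.95, 0.05],
                  [0.70, 0.30],
                  [0.20, 0.80]])

# Hypothetical costs: missing class 1 costs 10, a false alarm costs 1.
cost = np.array([[0.0, 1.0],
                 [10.0, 0.0]])

labels = min_expected_cost_labels(proba, cost)
plain = proba.argmax(axis=1)  # cost-blind labels, for comparison
```

With these assumed numbers, the second example is assigned class 1 by the cost-aware rule even though class 0 is more probable, because the expected cost of predicting class 0 (0.30 × 10 = 3.0) exceeds that of predicting class 1 (0.70 × 1 = 0.7).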
Pages: 508-516
Number of pages: 9