Boosting support vector machines for imbalanced data sets

Cited by: 0
Authors
Benjamin X. Wang
Nathalie Japkowicz
Affiliations
[1] Datalong Technology Ltd., School of Information Technology and Engineering
[2] University of Ottawa
Keywords
Imbalanced data sets; Support vector machines; Boosting
DOI
Not available
Abstract
Real-world data mining applications must address the issue of learning from imbalanced data sets. The problem occurs when the number of instances in one class greatly exceeds the number of instances in the other class. Such data sets often cause a default classifier to be built, due to skewed vector spaces or lack of information. Common approaches for dealing with the class imbalance problem involve modifying the data distribution or modifying the classifier. In this work, we use a combination of both approaches. We use support vector machines with soft margins as the base classifier to solve the skewed vector spaces problem. We then counter the excessive bias introduced by this approach with a boosting algorithm. We found that this ensemble of SVMs makes an impressive improvement in prediction performance, not only for the majority class, but also for the minority class.
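The combination described in the abstract, soft-margin SVMs as base classifiers whose outputs are combined by boosting, can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses textbook AdaBoost and a linear SVM fitted by subgradient descent, not the authors' algorithm, and the function names, toy data, and hyperparameters (C, lr, epochs, rounds) are all hypothetical.

```python
import math

def train_weighted_svm(X, y, d, C=10.0, lr=0.05, epochs=300):
    """Linear soft-margin SVM fitted by full-batch subgradient descent.

    Minimises 0.5*||w||^2 + C * sum_i d[i] * max(0, 1 - y[i]*(w.x[i] + b)),
    where d holds the per-sample weights supplied by the booster.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)              # gradient of the 0.5*||w||^2 term
        grad_b = 0.0
        for xi, yi, di in zip(X, y, d):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:          # inside the margin or misclassified
                for j, xj in enumerate(xi):
                    grad_w[j] -= C * di * yi * xj
                grad_b -= C * di * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def svm_predict(model, x):
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

def boost_svms(X, y, rounds=5):
    """Textbook AdaBoost over weighted SVMs (an assumption, not the paper's variant)."""
    n = len(X)
    d = [1.0 / n] * n                 # start from uniform sample weights
    models, alphas = [], []
    for _ in range(rounds):
        model = train_weighted_svm(X, y, d)
        preds = [svm_predict(model, xi) for xi in X]
        err = sum(di for di, p, yi in zip(d, preds, y) if p != yi)
        if err >= 0.5:
            break                     # no better than chance on the weighted sample
        alpha = 0.5 * math.log((1.0 - err + 1e-10) / (err + 1e-10))
        models.append(model)
        alphas.append(alpha)
        if err == 0:
            break                     # training set already fitted perfectly
        # Raise the weight of misclassified points, lower the rest, renormalise.
        d = [di * math.exp(-alpha * yi * p) for di, yi, p in zip(d, y, preds)]
        total = sum(d)
        d = [di / total for di in d]
    return models, alphas

def ensemble_predict(models, alphas, x):
    score = sum(a * svm_predict(m, x) for m, a in zip(models, alphas))
    return 1 if score >= 0 else -1

# Hypothetical imbalanced toy set: 8 majority (-1) points, 2 minority (+1) points.
X = [[-1.0, -1.2], [-1.5, -0.8], [-0.9, -1.1], [-1.2, -1.4], [-0.7, -0.9],
     [-1.1, -0.6], [-1.3, -1.0], [-0.8, -1.3], [1.0, 1.1], [1.2, 0.9]]
y = [-1, -1, -1, -1, -1, -1, -1, -1, 1, 1]

models, alphas = boost_svms(X, y)
preds = [ensemble_predict(models, alphas, xi) for xi in X]
```

In this sketch, each boosting round increases the weight of the points the previous SVM misclassified. When those points belong to the minority class, later base classifiers are pushed to pay more attention to them.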
Pages: 1 - 20
Page count: 19