SVM classification for imbalanced data sets using a multiobjective optimization framework

Cited by: 12
Authors
Askan, Aysegul [1]
Sayin, Serpil [2]
Affiliations
[1] Garanti Teknol, TR-34212 Istanbul, Turkey
[2] Koc Univ, Coll Adm Sci & Econ, TR-34450 Istanbul, Turkey
Keywords
SVM; Imbalanced data; Multiobjective optimization; Efficient frontier; ROBUST;
DOI
10.1007/s10479-012-1300-5
Chinese Library Classification (CLC)
C93 [Management Science]; O22 [Operations Research];
Subject classification codes
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
Classification of imbalanced data sets in which negative instances outnumber the positive instances is a significant challenge. These data sets are commonly encountered in real-life problems. However, performance of well-known classifiers is limited in such cases. Various solution approaches have been proposed for the class imbalance problem using either data-level or algorithm-level modifications. Support Vector Machines (SVMs), which have a solid theoretical background, also encounter a dramatic decrease in performance when the data distribution is imbalanced. In this study, we propose an L1-norm SVM approach that is based on a three-objective optimization problem so as to incorporate into the formulation the error sums for the two classes independently. Motivated by the inherent multiobjective nature of SVMs, the solution approach utilizes a reduction into two-criteria formulations and investigates the efficient frontier systematically. The results indicate that a comprehensive treatment of distinct positive and negative error levels may lead to performance improvements, at the cost of varying degrees of increased computational effort.
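As a reading aid, the three-objective idea in the abstract can be sketched as follows. This is an illustrative formulation only, not the authors' stated model: the index sets I^+ and I^- (positive and negative training instances), the slack variables ξ_i, and the exact constraint form are assumptions borrowed from the standard soft-margin SVM.

```latex
% Assumed notation: x_i are feature vectors, y_i \in \{+1,-1\} are labels,
% I^+ and I^- index the positive and negative instances, \xi_i are slacks.
\begin{aligned}
\min_{w,\,b,\,\xi}\quad
  & \Bigl(\ \lVert w \rVert_1,\ \ \sum_{i \in I^+} \xi_i,\ \ \sum_{i \in I^-} \xi_i \ \Bigr) \\
\text{s.t.}\quad
  & y_i \,\bigl(w^\top x_i + b\bigr) \ \ge\ 1 - \xi_i, \qquad i \in I^+ \cup I^-, \\
  & \xi_i \ \ge\ 0, \qquad i \in I^+ \cup I^- .
\end{aligned}
```

For comparison, the usual soft-margin SVM merges the two error sums into a single term weighted by one penalty parameter, so it cannot trade off positive-class and negative-class errors separately. Bounding one of the three objectives, or combining two of them with a weight, would yield a two-criteria problem; which reduction the paper actually uses is not stated in the abstract.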
Pages: 191 - 203
Page count: 13
Related papers
50 in total
  • [31] An Improved Algorithm for SVMs Classification of Imbalanced Data Sets
    Castro, Cristiano Leite
    Carvalho, Mateus Araujo
    Braga, Antonio Padua
    ENGINEERING APPLICATIONS OF NEURAL NETWORKS, PROCEEDINGS, 2009, 43 : 108 - 118
  • [32] Classification of imbalanced marketing data with balanced random sets
    Nikulin, Vladimir
    McLachlan, Geoffrey J.
    Journal of Machine Learning Research, 2009, 7 : 89 - 100
  • [33] Multiobjective hybrid monarch butterfly optimization for imbalanced disease classification problem
    Nalluri, MadhuSudana Rao
    Kannan, Krithivasan
    Gao, Xiao-Zhi
    Roy, Diptendu Sinha
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2020, 11 (07) : 1423 - 1451
  • [34] Instance importance based SVM for solving imbalanced data classification
    Yang, Yang
    Li, Shan-Ping
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2009, 22 (06) : 913 - 918
  • [35] Imbalanced Data Classification Based on a Hybrid Resampling SVM Method
    Cao, Lu
    Zhai, Yikui
    IEEE 12TH INT CONF UBIQUITOUS INTELLIGENCE & COMP/IEEE 12TH INT CONF ADV & TRUSTED COMP/IEEE 15TH INT CONF SCALABLE COMP & COMMUN/IEEE INT CONF CLOUD & BIG DATA COMP/IEEE INT CONF INTERNET PEOPLE AND ASSOCIATED SYMPOSIA/WORKSHOPS, 2015 : 1533 - 1536
  • [36] WOA plus BRNN: An imbalanced big data classification framework using Whale optimization and deep neural network
    Hassib, Eslam M.
    El-Desouky, Ali, I
    Labib, Labib M.
    El-kenawy, El-Sayed M.
    SOFT COMPUTING, 2020, 24 (08) : 5573 - 5592
  • [37] Using Locality-Sensitive Hashing for SVM Classification of Large Data Sets
    Gonzalez-Lima, Maria D.
    Ludena, Carenne C.
    MATHEMATICS, 2022, 10 (11)
  • [38] A study of the behaviour of linguistic fuzzy rule based classification systems in the framework of imbalanced data-sets
    Fernandez, Alberto
    Garcia, Salvador
    Jose del Jesus, Maria
    Herrera, Francisco
    FUZZY SETS AND SYSTEMS, 2008, 159 (18) : 2378 - 2398
  • [39] A New Segmented Oversampling Method for Imbalanced Data Classification Using Quasi-Linear SVM
    Zhou, Bo
    Li, Weite
    Hu, Jinglu
    IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING, 2017, 12 (06) : 891 - 898
  • [40] On strategies for imbalanced text classification using SVM: A comparative study
    Sun, Aixin
    Lim, Ee-Peng
    Liu, Ying
    DECISION SUPPORT SYSTEMS, 2009, 48 (01) : 191 - 201