A virtual multi-label approach to imbalanced data classification

Authors
Chou, Elizabeth P. [1 ]
Yang, Shan-Ping [1 ]
Affiliation
[1] Natl Chengchi Univ, Dept Stat, Taipei, Taiwan
Keywords
Imbalance; Classification; Virtual multi-label; Equal k-means; SUPPORT;
DOI
10.1080/03610918.2022.2049820
Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
Imbalanced data analysis is one of the most challenging issues in machine learning. In this type of research, correctly predicting minority labels is usually more critical than correctly predicting majority labels. Traditional machine learning techniques, however, are prone to learning bias: traditional classifiers tend to place all subjects in the majority group, which yields biased predictions. Studies of imbalanced data typically take one of two perspectives, data-based or model-based. Oversampling and undersampling are examples of data-based approaches, while adding costs, penalties, or weights to the optimization of the algorithm is typical of a model-based approach. Some ensemble methods have also been studied recently. These methods cause various problems, such as overfitting, the omission of some information, and long computation times, and they do not apply to all kinds of datasets. To address these problems, the virtual labels (ViLa) approach for the majority label is proposed as a solution to the class-imbalance problem. The study demonstrates a new multiclass classification approach that uses the equal K-means clustering method. The proposed method is compared with commonly used methods for imbalanced data, such as sampling methods (oversampling, undersampling, and SMOTE) and classifier methods (SVM and one-class SVM). The results show that the proposed method's relative performance improves as the degree of data imbalance increases, and that it gradually outperforms the other methods.
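The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea it describes: relabel the majority class into several equally sized "virtual" classes, train a multiclass classifier, and collapse the virtual labels back to the majority label at prediction time. Everything specific here is an assumption rather than the authors' method: a greedy capacity-constrained k-means stands in for "equal K-means", a nearest-centroid rule stands in for the multiclass classifier, and the function names (`equal_size_clusters`, `fit_virtual_labels`, `predict`) and the toy data are hypothetical.

```python
import math
import random


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def equal_size_clusters(points, k, iters=10, seed=0):
    """Assign points to k clusters, none larger than ceil(n / k).

    Assumption: a greedy stand-in for the paper's "equal K-means".
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    cap = math.ceil(len(points) / k)
    assign = [0] * len(points)
    for _ in range(iters):
        # Visit (point, center) pairs from closest to farthest and give each
        # point the first center that still has room.
        pairs = sorted(
            (dist2(p, c), i, j)
            for i, p in enumerate(points)
            for j, c in enumerate(centers)
        )
        sizes = [0] * k
        placed = [False] * len(points)
        for _, i, j in pairs:
            if not placed[i] and sizes[j] < cap:
                assign[i], placed[i] = j, True
                sizes[j] += 1
        # Recompute the center of every non-empty cluster.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = centroid(members)
    return assign


def fit_virtual_labels(X, y, minority, k):
    """Split the majority class into k virtual classes; return class centroids."""
    minority_pts = [x for x, lab in zip(X, y) if lab == minority]
    majority_pts = [x for x, lab in zip(X, y) if lab != minority]
    groups = equal_size_clusters(majority_pts, k)
    model = {("minority", 0): centroid(minority_pts)}
    for j in range(k):
        members = [p for p, g in zip(majority_pts, groups) if g == j]
        if members:
            model[("virtual", j)] = centroid(members)
    return model, groups


def predict(model, x):
    """Pick the nearest class centroid, then collapse virtual labels."""
    kind, _ = min(model, key=lambda lab: dist2(x, model[lab]))
    return "minority" if kind == "minority" else "majority"


# Toy data: two majority blobs (6 points each) and a small minority group.
majority = [(0, 0), (0, 1), (1, 0), (1, 1), (-1, 0), (0, -1),
            (10, 0), (10, 1), (11, 0), (11, 1), (9, 0), (10, -1)]
minority = [(5, 5), (5, 6)]
X = majority + minority
y = ["maj"] * len(majority) + ["min"] * len(minority)

model, groups = fit_virtual_labels(X, y, minority="min", k=2)
print(sorted(groups.count(j) for j in range(2)))
print(predict(model, (5, 5.5)), predict(model, (0, 0)))
```

With the majority split into k virtual classes of roughly n/k samples each, every class the classifier sees is closer in size to the minority class, which is the motivation the abstract points to. In practice k would presumably be chosen relative to the imbalance ratio, but the abstract does not say how.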
Pages: 1461-1471
Number of pages: 11