A virtual multi-label approach to imbalanced data classification

Cited by: 0
Authors
Chou, Elizabeth P. [1]
Yang, Shan-Ping [1]
Affiliation
[1] National Chengchi University, Department of Statistics, Taipei, Taiwan
Keywords
Imbalance; Classification; Virtual multi-label; Equal k-means; SUPPORT;
DOI
10.1080/03610918.2022.2049820
Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes
020208; 070103; 0714
Abstract
Imbalanced data analysis is one of the most challenging issues in machine learning. In this type of research, correctly predicting minority labels is usually more critical than correctly predicting majority labels. However, traditional classifiers tend to place all subjects in the majority group, resulting in biased predictions. Studies of this problem are typically conducted from one of two perspectives: a data-based perspective or a model-based perspective. Oversampling and undersampling are examples of data-based approaches, while the addition of costs, penalties, or weights to optimize the algorithm is typical of a model-based approach. Some ensemble methods have also been studied recently. These methods can cause various problems, such as overfitting, the omission of some information, and long computation times, and they do not apply to all kinds of datasets. To address these problems, the virtual labels (ViLa) approach for the majority label is proposed. The study demonstrates a new multiclass classification approach that uses the equal K-means clustering method. The proposed method is compared with commonly used methods for imbalanced data, such as sampling methods (oversampling, undersampling, and SMOTE) and classifier methods (SVM and one-class SVM). The results show that the proposed method performs better as the degree of data imbalance increases and gradually outperforms the other methods.
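The abstract gives only the outline of the virtual-label idea, so the following is a minimal Python sketch under stated assumptions, not the authors' implementation. It splits the majority class into size-balanced clusters, treats each cluster as a virtual label, fits a multiclass model, and maps predictions back to the two original labels. The function names (`equal_kmeans`, `fit_vila`, `predict_vila`), the greedy capacity-limited assignment, the choice of the number of clusters as the majority-to-minority ratio, and the nearest-centroid classifier are all illustrative assumptions.

```python
from math import dist


def centroid(members):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(members) for col in zip(*members))


def equal_kmeans(points, k, iters=10):
    """Size-balanced k-means stand-in: no cluster exceeds ceil(n / k) points."""
    # Deterministic farthest-point initialisation (an assumption of this sketch).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    cap = -(-len(points) // k)  # ceil(n / k)
    labels = [None] * len(points)
    for _ in range(iters):
        # Visit (point, cluster) pairs from closest to farthest and assign
        # greedily while the cluster still has room.
        pairs = sorted(
            (dist(p, c), i, j)
            for i, p in enumerate(points)
            for j, c in enumerate(centroids)
        )
        labels = [None] * len(points)
        counts = [0] * k
        for _, i, j in pairs:
            if labels[i] is None and counts[j] < cap:
                labels[i] = j
                counts[j] += 1
        # Recompute each centroid from its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = centroid(members)
    return labels


def fit_vila(X, y, majority=0, minority=1):
    """Fit a nearest-centroid model over virtual majority labels.

    Assumes both classes are non-empty.
    """
    maj = [x for x, t in zip(X, y) if t == majority]
    mino = [x for x, t in zip(X, y) if t == minority]
    # Assumption: use about as many virtual labels as the imbalance ratio.
    k = max(1, round(len(maj) / len(mino)))
    virtual = equal_kmeans(maj, k)
    model = {("min", 0): centroid(mino)}
    for j in range(k):
        members = [p for p, lab in zip(maj, virtual) if lab == j]
        if members:
            model[("maj", j)] = centroid(members)
    return model


def predict_vila(model, x, majority=0, minority=1):
    """Predict a virtual label, then map it back to the original label."""
    name = min(model, key=lambda c: dist(x, model[c]))
    return minority if name[0] == "min" else majority


# Toy data: two majority groups with the minority class between them.
majority_points = [(0, 0), (0, 1), (1, 0), (1, 1),
                   (10, 0), (10, 1), (11, 0), (11, 1)]
minority_points = [(5, 0), (5, 1), (6, 0), (6, 1)]
X = majority_points + minority_points
y = [0] * len(majority_points) + [1] * len(minority_points)
model = fit_vila(X, y)
```

In this toy layout the mean of the whole majority class coincides with the mean of the minority class, so a single majority centroid could not separate them; splitting the majority into two virtual labels gives each group its own centroid. Any multiclass classifier could replace the nearest-centroid step.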
Pages: 1461-1471
Number of pages: 11