An oversampling method for multi-class imbalanced data based on composite weights

被引:9
|
作者
Deng, Mingyang [1 ,2 ]
Guo, Yingshi [1 ]
Wang, Chang [1 ]
Wu, Fuwei [1 ]
机构
[1] Changan Univ, Sch Automobile, Xian, Peoples R China
[2] Changchun Univ Technol, Coll Automobile Engn, Coll Humanities & Informat, Changchun, Peoples R China
来源
PLOS ONE | 2021年 / 16卷 / 11期
基金
国家重点研发计划; 中国国家自然科学基金;
关键词
ALGORITHM; CLASSIFICATION; SMOTE;
D O I
10.1371/journal.pone.0259227
中图分类号
O [数理科学和化学]; P [天文学、地球科学]; Q [生物科学]; N [自然科学总论];
学科分类号
07 ; 0710 ; 09 ;
摘要
To solve the oversampling problem of multi-class small samples and to improve their classification accuracy, we develop an oversampling method based on classification ranking and weight setting. The designed oversampling algorithm sorts the data within each class of dataset according to the distance from original data to the hyperplane. Furthermore, iterative sampling is performed within the class and inter-class sampling is adopted at the boundaries of adjacent classes according to the sampling weight composed of data density and data sorting. Finally, information assignment is performed on all newly generated sampling data. The training and testing experiments of the algorithm are conducted by using the UCI imbalanced datasets, and the established composite metrics are used to evaluate the performance of the proposed algorithm and other algorithms in comprehensive evaluation method. The results show that the proposed algorithm makes the multi-class imbalanced data balanced in terms of quantity, and the newly generated data maintain the distribution characteristics and information properties of the original samples. Moreover, compared with other algorithms such as SMOTE and SVMOM, the proposed algorithm has reached a higher classification accuracy of about 90%. It is concluded that this algorithm has high practicability and general characteristics for imbalanced multi-class samples.
引用
收藏
页数:15
相关论文
共 50 条
  • [21] An Algorithm for Selective Preprocessing of Multi-class Imbalanced Data
    Wojciechowski, Szymon
    Wilk, Szymon
    Stefanowski, Jerzy
    [J]. PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON COMPUTER RECOGNITION SYSTEMS CORES 2017, 2018, 578 : 238 - 247
  • [22] A survey of multi-class imbalanced data classification methods
    Han, Meng
    Li, Ang
    Gao, Zhihui
    Mu, Dongliang
    Liu, Shujuan
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 44 (02) : 2471 - 2501
  • [23] A Dynamic Sampling Framework for Multi-Class Imbalanced Data
    Debowski, B.
    Areibi, S.
    Grewal, G.
    Tempelman, J.
    [J]. 2012 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2012), VOL 2, 2012, : 113 - 118
  • [24] Multi-class imbalanced big data classification on Spark
    Sleeman, William C.
    Krawczyk, Bartosz
    [J]. KNOWLEDGE-BASED SYSTEMS, 2021, 212
  • [25] Multi-Class Imbalance Classification Based on Data Distribution and Adaptive Weights
    Li, Shuxian
    Song, Liyan
    Wu, Xiaoyu
    Hu, Zheng
    Cheung, Yiu-ming
    Yao, Xin
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (10) : 5265 - 5279
  • [26] Hybrid Sampling and Dynamic Weighting-Based Classification Method for Multi-Class Imbalanced Data Stream
    Han, Meng
    Li, Ang
    Gao, Zhihui
    Mu, Dongliang
    Liu, Shujuan
    [J]. APPLIED SCIENCES-BASEL, 2023, 13 (10):
  • [27] Feature Selection for Multi-Class Imbalanced Data Sets Based on Genetic Algorithm
    Du L.-M.
    Xu Y.
    Zhu H.
    [J]. Ann. Data Sci., 3 (293-300): : 293 - 300
  • [28] Learning from Combination of Data Chunks for Multi-class Imbalanced Data
    Liu, Xu-Ying
    Li, Qian-Qian
    [J]. PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014, : 1680 - 1687
  • [29] An Under-Sampling Method with Support Vectors in Multi-class Imbalanced Data Classification
    Arafat, Md. Yasir
    Hoque, Sabera
    Xu, Shuxiang
    Farid, Dewan Md.
    [J]. 2019 13TH INTERNATIONAL CONFERENCE ON SOFTWARE, KNOWLEDGE, INFORMATION MANAGEMENT AND APPLICATIONS (SKIMA), 2019,
  • [30] AUC Evaluation of Multi-class Classifier Performance in Imbalanced Data
    Ni, Huangjing
    Wang, Wei
    [J]. 2010 INTERNATIONAL CONFERENCE ON FUTURE CONTROL AND AUTOMATION (ICFCA 2010), 2010, : 48 - 51