A Novel Algorithm for Imbalance Data Classification Based on Genetic Algorithm Improved SMOTE

Cited by: 2
Authors
Kun Jiang
Jing Lu
Kuiliang Xia
Affiliations
[1] HeiHe University,
Keywords
Imbalanced dataset; Classification; SMOTE; Sampling rate; Genetic algorithm; Rockburst;
DOI
Not available
Abstract
The classification of imbalanced data is recognized as a crucial problem in machine learning and data mining. In an imbalanced dataset, one class has significantly fewer training instances than another, so minority class instances are much more likely to be misclassified. The synthetic minority over-sampling technique (SMOTE) was developed to deal with the classification of imbalanced datasets: it balances the dataset by synthesizing new minority class samples from the existing minority class instances. However, existing SMOTE-based algorithms use the same sampling rate for all minority class instances, which results in sub-optimal performance. To address this issue, we propose a novel genetic algorithm-based SMOTE (GASMOTE) algorithm. GASMOTE uses different sampling rates for different minority class instances and searches for the optimal combination of sampling rates. Experimental results on ten typical imbalanced datasets show that, compared with SMOTE, GASMOTE improves the F-measure by 5.9% and the G-mean by 1.6%; compared with Borderline-SMOTE, it improves the F-measure by 3.7% and the G-mean by 2.3%. GASMOTE can therefore be used as a new over-sampling technique for imbalanced dataset classification. We have also applied GASMOTE to a practical engineering problem: predicting rockburst on the VCR rockburst datasets. The experimental results indicate that GASMOTE can accurately predict rockburst occurrence and hence provides guidance for the design and construction of safe deep mining engineering structures.
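The idea of giving each minority class instance its own sampling rate can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`nearest_neighbors`, `smote_with_rates`), the parameters `k` and `seed`, and the toy data are all hypothetical, and the genetic algorithm that would search for a good `rates` vector is not shown. One plausible setup, stated here as an assumption, is for the genetic algorithm to encode `rates` as a chromosome and score it by classifier performance such as F-measure or G-mean.

```python
import math
import random


def nearest_neighbors(points, index, k):
    """Return indices of the k points closest (Euclidean) to points[index]."""
    others = [j for j in range(len(points)) if j != index]
    others.sort(key=lambda j: math.dist(points[index], points[j]))
    return others[:k]


def smote_with_rates(minority, rates, k=3, seed=0):
    """Create rates[i] synthetic samples from minority[i] by interpolation.

    Plain SMOTE would use the same rate for every instance; here each
    instance carries its own (hypothetical) rate.
    """
    if len(minority) != len(rates):
        raise ValueError("one sampling rate per minority instance is required")
    rng = random.Random(seed)
    synthetic = []
    for i, rate in enumerate(rates):
        neighbors = nearest_neighbors(minority, i, k)
        if rate == 0 or not neighbors:
            continue  # nothing to generate for this instance
        for _ in range(rate):
            j = rng.choice(neighbors)  # pick one of the k nearest minority neighbors
            gap = rng.random()         # position along the segment between the two points
            synthetic.append(
                tuple(a + gap * (b - a) for a, b in zip(minority[i], minority[j]))
            )
    return synthetic


# Toy minority class with four 2-D instances and one rate per instance.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (3.0, 3.0)]
rates = [2, 0, 1, 3]
new_samples = smote_with_rates(minority, rates)
print(len(new_samples))  # 6, the sum of the rates
```

Each synthetic sample lies on the line segment between a minority instance and one of its nearest minority neighbors, so the number of samples generated around each instance is controlled entirely by its entry in `rates`.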
Pages: 3255-3266
Number of pages: 11
Related papers
50 in total
  • [31] An Improved Over-sampling Algorithm based on iForest and SMOTE
    Zheng, Yifeng
    Li, Guohe
    Zhang, Teng
    2019 8TH INTERNATIONAL CONFERENCE ON SOFTWARE AND COMPUTER APPLICATIONS (ICSCA 2019), 2019, : 75 - 80
  • [32] Improved cultural algorithm based on genetic algorithm
    Xue, Zhengui
    Guo, Yinan
    2007 IEEE INTERNATIONAL CONFERENCE ON INTEGRATION TECHNOLOGY, PROCEEDINGS, 2007, : 117 - +
  • [33] An adjustable fuzzy classification algorithm using an improved multi-objective genetic strategy based on decomposition for imbalance dataset
    Liu, Ruochen
    Wang, Fangfang
    He, Manman
    Jiao, Licheng
    KNOWLEDGE AND INFORMATION SYSTEMS, 2019, 61 (03) : 1583 - 1605
  • [34] An adjustable fuzzy classification algorithm using an improved multi-objective genetic strategy based on decomposition for imbalance dataset
    Ruochen Liu
    Fangfang Wang
    Manman He
    Licheng Jiao
    Knowledge and Information Systems, 2019, 61 : 1583 - 1605
  • [35] The Application of SMOTE Algorithm for Unbalanced Data
    Lv, Dong
    Ma, ZhiCheng
    Yang, Shibo
    Li, Xianbo
    Ma, Zhixin
    Jiang, Fan
    AIVR 2018: 2018 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND VIRTUAL REALITY, 2018, : 6 - 9
  • [36] IHBA: An Improved Homogeneity-Based Algorithm for Data Classification
    Bekaddour, Fatima
    Amine, Chikh Mohammed
    COMPUTER SCIENCE AND ITS APPLICATIONS, CIIA 2015, 2015, 456 : 129 - 140
  • [37] Novel Inversion Algorithm for the Atmospheric Aerosol Extinction Coefficient Based on an Improved Genetic Algorithm
    Hu, Minghuan
    Li, Shun
    Mao, Jiandong
    Li, Juan
    Wang, Qiang
    Zhang, Yi
    PHOTONICS, 2022, 9 (08)
  • [38] A Novel Defogging Algorithm Based on Genetic Algorithm with Analysis of Scientific Data Materials
    Li, Xiao-Guang
    Kang, Li
    ADVANCED BUILDING MATERIALS AND STRUCTURAL ENGINEERING, 2012, 461 : 806 - 809
  • [39] A Novel Dynamic Task Scheduling Algorithm Based on Improved Genetic Algorithm in Cloud Computing
    Ma, Juntao
    Li, Weitao
    Fu, Tian
    Yan, Lili
    Hu, Guojie
    WIRELESS COMMUNICATIONS, NETWORKING AND APPLICATIONS, WCNA 2014, 2016, 348 : 829 - 835
  • [40] A Genetic Programming Based ECOC Algorithm for Microarray Data Classification
    Wang, HanRui
    Li, KeSen
    Liu, KunHong
    NEURAL INFORMATION PROCESSING (ICONIP 2017), PT VI, 2017, 10639 : 683 - 691