A Novel Algorithm for Imbalance Data Classification Based on Genetic Algorithm Improved SMOTE

Cited by: 2
Authors
Kun Jiang
Jing Lu
Kuiliang Xia
Affiliations
[1] HeiHe University,
Keywords
Imbalanced dataset; Classification; SMOTE; Sampling rate; Genetic algorithm; Rockburst;
DOI
Not available
Abstract
The classification of imbalanced data is recognized as a crucial problem in machine learning and data mining. In an imbalanced dataset, one class has significantly fewer training instances than another, so minority class instances are much more likely to be misclassified. The synthetic minority over-sampling technique (SMOTE) was developed to deal with the classification of imbalanced datasets. It balances the dataset by synthesizing new minority class samples from the existing minority class instances. However, existing SMOTE-based algorithms use the same sampling rate for all minority class instances, which results in sub-optimal performance. To address this issue, we propose a novel genetic algorithm-based SMOTE (GASMOTE) algorithm. GASMOTE uses different sampling rates for different minority class instances and searches for the optimal combination of sampling rates. Experimental results on ten typical imbalanced datasets show that, compared with SMOTE, GASMOTE increases the F-measure by 5.9% and the G-mean by 1.6%, and compared with Borderline-SMOTE, it increases the F-measure by 3.7% and the G-mean by 2.3%. GASMOTE can therefore serve as a new over-sampling technique for imbalanced dataset classification. In particular, we applied GASMOTE to a practical engineering problem: rockburst prediction on the VCR rockburst datasets. The experimental results indicate that GASMOTE can accurately predict rockburst occurrence and hence provides guidance for the design and construction of safe deep mining engineering structures.
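The idea summarized in the abstract, SMOTE-style interpolation with a separate sampling rate per minority instance and a genetic search over those rates, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `smote_per_instance` and `ga_search_rates`, the integer encoding of rates, the truncation selection, uniform crossover, and the user-supplied `fitness` callable are all assumptions made for illustration. The record does not describe the fitness function; a classifier score such as F-measure or G-mean on held-out data would be one plausible choice.

```python
import numpy as np


def smote_per_instance(X_min, rates, k=5, rng=None):
    """Create rates[i] synthetic samples from minority instance i (hypothetical helper).

    X_min : (n, d) array of minority-class instances, n >= 2.
    rates : length-n sequence of non-negative integers, one rate per instance.
    k     : number of nearest minority neighbours used for interpolation.
    """
    rng = np.random.default_rng(rng)
    X_min = np.asarray(X_min, dtype=float)
    n, d = X_min.shape
    if n < 2:
        raise ValueError("need at least two minority instances")
    k = min(k, n - 1)
    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is never its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]
    synthetic = []
    for i, rate in enumerate(rates):
        for _ in range(int(rate)):
            j = rng.choice(neighbours[i])  # pick one of the k nearest neighbours
            gap = rng.random()             # interpolation factor in [0, 1)
            synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic).reshape(-1, d)


def ga_search_rates(fitness, n, max_rate=5, pop_size=20, generations=30,
                    mutation_prob=0.1, rng=None):
    """Search integer rate vectors of length n that maximise fitness (assumed GA design)."""
    rng = np.random.default_rng(rng)
    pop = rng.integers(0, max_rate + 1, size=(pop_size, n))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        order = np.argsort(scores)[::-1]
        parents = pop[order[: pop_size // 2]]  # truncation selection keeps the best half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            mask = rng.random(n) < 0.5         # uniform crossover
            child = np.where(mask, a, b)
            mutate = rng.random(n) < mutation_prob
            child = np.where(mutate, rng.integers(0, max_rate + 1, size=n), child)
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    scores = np.array([fitness(ind) for ind in pop])
    return pop[int(np.argmax(scores))]
```

In use, `fitness` would typically call `smote_per_instance` with a candidate rate vector, train a classifier on the augmented data, and return its validation score. Setting every rate to the same value reduces the first function to ordinary SMOTE with a uniform sampling rate, which is the baseline behaviour the abstract contrasts against.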
Pages: 3255-3266
Page count: 11
Related papers
50 in total
  • [41] Research on Classification of Data Mining Based Niche Genetic Algorithm
    Zhang, Beibei
    Zhu, Li
    Li, Yanli
    PROGRESS IN INTELLIGENCE COMPUTATION AND APPLICATIONS, 2008, : 197 - 199
  • [42] A Parallel Classification algorithm based on Hybrid Genetic Algorithm
    Xiong, Zhongyang
    Zhang, Yufang
    Zhang, Lei
    Niu, Shujie
    WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS, 2006, : 3237 - +
  • [43] A Novel Image Restore Method Based on Improved Genetic Algorithm
    Chen Wenjie
    Dou Lihua
    PROCEEDINGS OF THE 29TH CHINESE CONTROL CONFERENCE, 2010, : 3081 - 3086
  • [44] A Novel Thrust Allocation Method Based on Improved Genetic Algorithm
    Ding, Fuguang
    Yu, Qingqing
    Xu, Yujie
    Wang, Yuanhui
    PROCEEDINGS OF THE 38TH CHINESE CONTROL CONFERENCE (CCC), 2019, : 1869 - 1874
  • [45] Data Source Selection Based on an Improved Greedy Genetic Algorithm
    Yang, Jian
    Xing, Chunxiao
    SYMMETRY-BASEL, 2019, 11 (02):
  • [46] Building the classification model based on the genetic algorithm and the improved Bayesian method
    Pham-Toan, Dinh
    Vo-Van, Tai
    INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2024, 18 (04) : 405 - 421
  • [47] Improved Bat Algorithm Based on RNA Genetic Algorithm
    Geng Y.
    Zhang L.
    Sun Y.
    Fei T.
    Jiang S.
    Ma J.
    Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/Journal of Tianjin University Science and Technology, 2019, 52 (03): : 315 - 320
  • [48] An Improved Simulated Annealing Algorithm based on Genetic Algorithm
    Li, Shufei
    MECHATRONICS AND INTELLIGENT MATERIALS II, PTS 1-6, 2012, 490-495 : 267 - 271
  • [49] A Clustering Routing Algorithm Based on Improved Genetic Algorithm
    Jiao W.
    Ding F.
    Shi J.
    Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications, 2023, 46 (06): : 83 - 88
  • [50] A Network Selection Algorithm Based on Improved Genetic Algorithm
    Chen, Juanmin
    Zhang, Damin
    Liu, Dong
    Pan, Zhiyan
    2018 IEEE 18TH INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY (ICCT), 2018, : 209 - 214