A Novel Algorithm for Imbalance Data Classification Based on Genetic Algorithm Improved SMOTE

Cited by: 2
Authors
Kun Jiang
Jing Lu
Kuiliang Xia
Affiliations
[1] HeiHe University,
Keywords
Imbalanced dataset; Classification; SMOTE; Sampling rate; Genetic algorithm; Rockburst;
DOI
Not available
Abstract
The classification of imbalanced data is a widely recognized problem in machine learning and data mining. In an imbalanced dataset, one class has significantly fewer training instances than another, so minority class instances are much more likely to be misclassified. The synthetic minority over-sampling technique (SMOTE) was developed to deal with the classification of imbalanced datasets: it balances the dataset by re-sampling the minority class instances to synthesize new minority class samples. However, existing SMOTE-based algorithms use the same sampling rate for every minority class instance, which results in sub-optimal performance. To address this issue, we propose a novel genetic algorithm-based SMOTE (GASMOTE) algorithm. GASMOTE uses different sampling rates for different minority class instances and finds the optimal combination of sampling rates. Experimental results on ten typical imbalanced datasets show that, compared with SMOTE, GASMOTE increases the F-measure by 5.9% and the G-mean by 1.6%, and compared with Borderline-SMOTE, it increases the F-measure by 3.7% and the G-mean by 2.3%. GASMOTE can therefore serve as a new over-sampling technique for imbalanced dataset classification. We have also applied GASMOTE to a practical engineering problem: the prediction of rockburst on the VCR rockburst datasets. The experimental results indicate that GASMOTE can accurately predict rockburst occurrence and hence provides guidance for the design and construction of safe deep mining engineering structures.
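The core idea in the abstract, one sampling rate per minority instance rather than a single global rate, can be illustrated with a short sketch. This is not the authors' implementation: the function names `smote_per_instance` and `f_measure_and_g_mean`, the parameters `k` and `seed`, and the use of Euclidean nearest neighbours are assumptions made for illustration. The genetic search itself is omitted; in a GASMOTE-style setup the `rates` vector would presumably be the chromosome, and a score such as the F-measure or G-mean (standard definitions assumed here) could act as the fitness.

```python
import numpy as np


def smote_per_instance(X_min, rates, k=5, seed=0):
    """Create rates[i] synthetic points from minority instance i (hypothetical helper)."""
    X_min = np.asarray(X_min, dtype=float)
    rates = np.asarray(rates, dtype=int)
    if X_min.shape[0] != rates.shape[0]:
        raise ValueError("need one sampling rate per minority instance")
    n = X_min.shape[0]
    k = min(k, n - 1)
    if k < 1:
        raise ValueError("need at least two minority instances")
    rng = np.random.default_rng(seed)

    # Pairwise Euclidean distances within the minority class.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]

    synthetic = []
    for i in range(n):
        for _ in range(rates[i]):
            j = rng.choice(neighbours[i])  # random one of the k nearest neighbours
            gap = rng.random()             # interpolation weight in [0, 1)
            synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic).reshape(-1, X_min.shape[1])


def f_measure_and_g_mean(tp, fn, fp, tn):
    """Standard F-measure and G-mean from confusion-matrix counts.

    Assumes the minority class is the positive class and that no
    denominator is zero.
    """
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    g_mean = (recall * specificity) ** 0.5
    return f_measure, g_mean
```

For example, calling `smote_per_instance(X_min, rates=[3, 0, 1, 2], k=2)` on four minority points returns six synthetic points, three of them interpolated from the first instance and none from the second. Plain SMOTE corresponds to the special case in which every entry of `rates` is equal.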
Pages: 3255-3266
Number of pages: 11