HSDLM: A Hybrid Sampling With Deep Learning Method for Imbalanced Data Classification

Cited by: 27
Authors
Hasib, Khan Md [1]
Towhid, Nurul Akter [2]
Islam, Md Rafiqul [3]
Affiliations
[1] Ahsanullah Univ Sci & Engn, Dhaka, Bangladesh
[2] Jahangirnagar Univ, Dhaka, Bangladesh
[3] Univ Technol Sydney UTS, Sydney, NSW, Australia
Keywords
Class Imbalance; Classification; Deep Learning; ENN; LSTM; Sampling; SMOTE; SUPPORT
DOI
10.4018/IJCAC.2021100101
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Imbalanced data presents many difficulties, as most learners will be biased toward the majority class and, in severe cases, may disregard the minority class entirely. Over the last few decades, class imbalance has been extensively researched using traditional machine learning techniques. However, there is relatively little analytical research on class imbalance in the field of deep learning. In this article, the authors classify imbalanced data by combining a sampling method with a deep learning method. They propose a novel sampling-based deep learning method (HSDLM) to address the class imbalance problem. They preprocess the data with label encoding and remove noisy data with the under-sampling technique edited nearest neighbor (ENN). They then balance the data using the over-sampling technique SMOTE and apply three types of long short-term memory (LSTM) networks, a deep learning classifier, in parallel. The experimental findings indicate that HSDLM is a promising solution for working with strongly imbalanced datasets.
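The sampling stage described in the abstract (ENN cleaning followed by SMOTE over-sampling) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names `enn_clean` and `smote_balance`, the toy data, the neighbour counts, and the choice to protect the minority class during ENN are all assumptions made for illustration, and the LSTM classification stage is omitted.

```python
import math
import random
from collections import Counter


def k_nearest(idx, points, k):
    """Indices of the k points closest to points[idx], excluding idx itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def enn_clean(X, y, k=3, protect=None):
    """Drop samples whose label disagrees with the majority vote of their k neighbours.

    Assumption: samples labelled `protect` (here, the minority class) are never dropped.
    """
    keep = []
    for i, label in enumerate(y):
        votes = Counter(y[j] for j in k_nearest(i, X, k))
        if label == protect or votes.most_common(1)[0][0] == label:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_balance(X, y, minority, k=3, seed=0):
    """Add interpolated minority samples until the minority matches the largest class."""
    rng = random.Random(seed)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    if len(minority_pts) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, len(minority_pts) - 1)
    target = max(Counter(y).values())
    X_out, y_out = list(X), list(y)
    while y_out.count(minority) < target:
        i = rng.randrange(len(minority_pts))
        j = rng.choice(k_nearest(i, minority_pts, k))
        gap = rng.random()  # position along the segment between the two samples
        synthetic = [a + gap * (b - a) for a, b in zip(minority_pts[i], minority_pts[j])]
        X_out.append(synthetic)
        y_out.append(minority)
    return X_out, y_out


# Hypothetical toy data: class 0 is the majority, class 1 the minority.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.3], [0.1, 0.0],
     [2.1, 2.0],  # majority-labelled point sitting inside the minority cluster
     [2.0, 2.0], [2.2, 2.1], [2.0, 2.2]]
y = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

X_clean, y_clean = enn_clean(X, y, k=3, protect=1)
X_bal, y_bal = smote_balance(X_clean, y_clean, minority=1, k=2)
print(Counter(y_clean), Counter(y_bal))
```

In this toy run, ENN removes the majority-labelled point embedded in the minority cluster, and SMOTE then interpolates between existing minority samples until both classes have the same count. In practice a maintained library would normally be used instead of hand-written routines.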
Pages: 1-13
Page count: 13
Related papers
50 in total
  • [21] Hybrid sampling-based contrastive learning for imbalanced node classification
    Caixia Cui
    Jie Wang
    Wei Wei
    Jiye Liang
    International Journal of Machine Learning and Cybernetics, 2023, 14 : 989 - 1001
  • [22] Parallel selective sampling method for imbalanced and large data classification
    D'Addabbo, Annarita
    Maglietta, Rosalia
    PATTERN RECOGNITION LETTERS, 2015, 62 : 61 - 67
  • [23] A cluster-based hybrid sampling approach for imbalanced data classification
    Feng, Shou
    Zhao, Chunhui
    Fu, Ping
    REVIEW OF SCIENTIFIC INSTRUMENTS, 2020, 91 (05):
  • [24] Hygeia: A Multilabel Deep Learning-Based Classification Method for Imbalanced Electrocardiogram Data
    Xu, Xiaolong
    Xu, Haoyan
    Wang, Liying
    Zhang, Yuanyuan
    Xiao, Fu
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2023, 20 (04) : 2480 - 2493
  • [25] A GAN-based hybrid sampling method for imbalanced customer classification
    Zhu, Bing
    Pan, Xin
    vanden Broucke, Seppe
    Xiao, Jin
    INFORMATION SCIENCES, 2022, 609 : 1397 - 1411
  • [26] Hybrid Sampling Method for Overlap Region of ICS Imbalanced Data
    Gao, Bing
    Gu, Zhaojun
    Zhou, Jingxian
    Sui, He
    Computer Engineering and Applications, 2023, 59 (19) : 305 - 315
  • [27] Imbalanced Data Classification Method Based on Ensemble Learning
    Xiang, Yu
    Xie, Yongping
    COMMUNICATIONS, SIGNAL PROCESSING, AND SYSTEMS, CSPS 2018, VOL III: SYSTEMS, 2020, 517 : 18 - 24
  • [28] Classification of Imbalanced Data Using Deep Learning with Adding Noise
    Fan, Wan-Wei
    Lee, Ching-Hung
    JOURNAL OF SENSORS, 2021, 2021 (2021)
  • [29] CVAE-Based Hybrid Sampling Data Augmentation Method and Interpretation for Imbalanced Classification of Gout Disease
    Si, Xiaonan
    Fu, Yifan
    Liu, Xinran
    Wang, Rulin
    Xu, Wenchang
    Wang, Lei
    ADVANCED INTELLIGENT COMPUTING IN BIOINFORMATICS, PT I, ICIC 2024, 2024, 14881 : 49 - 60
  • [30] An ensemble imbalanced classification method based on model dynamic selection driven by data partition hybrid sampling
    Gao, Xin
    Ren, Bing
    Zhang, Hao
    Sun, Bohao
    Li, Junliang
    Xu, Jianhang
    He, Yang
    Li, Kangsheng
    EXPERT SYSTEMS WITH APPLICATIONS, 2020, 160