Fuzzy Distance-based Undersampling Technique for Imbalanced Flood Data

Cited by: 0
Authors
Mahamud, Ku Ruhana Ku [1]
Zorkeflee, Maisarah [1]
Din, Aniza Mohamed [1]
Affiliations
[1] Univ Utara Malaysia, Changlun, Malaysia
Keywords
imbalanced flood data; resampling technique; fuzzy distance-based undersampling; fuzzy logic
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Classifier performance degrades on imbalanced data because instances in the minority class are often ignored. Imbalanced data occur in many application domains, including flood data. Misclassifying a flood case has a greater impact than misclassifying a non-flood case. Numerous resampling techniques, such as undersampling and oversampling, have been used to overcome the misclassification of imbalanced data. However, these techniques suffer from the elimination of relevant data and from overfitting, which may lead to poor classification results. This paper proposes a Fuzzy Distance-based Undersampling (FDUS) technique to increase classification accuracy. Entropy estimation is used to generate fuzzy thresholds, which are used to categorise the instances in the majority and minority classes into membership functions. The performance of FDUS was compared with that of three other techniques on flood data, using F-measure and G-mean. FDUS achieved better F-measure and G-mean than the other techniques, which showed that FDUS was able to reduce the elimination of relevant data.
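The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so this assumes the common definitions: F-measure as the harmonic mean of precision and recall (F1), and G-mean as the square root of sensitivity times specificity. The function name `f_measure_and_g_mean` and the labelling of flood as 1 and non-flood as 0 are hypothetical choices made for illustration only.

```python
import math


def f_measure_and_g_mean(y_true, y_pred, positive=1):
    """Return (F-measure, G-mean) for binary labels.

    Assumed definitions (not taken from the paper):
      F-measure = 2 * precision * recall / (precision + recall)
      G-mean    = sqrt(sensitivity * specificity)
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    g_mean = math.sqrt(recall * specificity)
    return f_measure, g_mean


# Hypothetical imbalanced labels: 2 flood cases (1) among 10 instances.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A classifier that always predicts "non-flood" is 80% accurate here,
# yet both metrics are zero because every flood case is missed.
always_majority = [0] * 10
print(f_measure_and_g_mean(y_true, always_majority))  # (0.0, 0.0)

# A classifier that finds one of the two flood cases, with one false alarm.
mixed = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(f_measure_and_g_mean(y_true, mixed))
```

Under these assumed definitions, both metrics reward correct minority-class predictions, which plain accuracy does not; this is consistent with the abstract's choice of F-measure and G-mean for imbalanced flood data.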
Pages: 509-513
Number of pages: 5