Exploring of clustering algorithm on class-imbalanced data

Cited by: 0
Authors
Li Xuan [1]
Chen Zhigang [1]
Yang Fan [1]
Affiliation
[1] Xiamen Univ, Dept Automat, Xiamen 361005, Fujian, Peoples R China
Keywords
Class-imbalanced data; Clustering algorithm; Imbalanced ratios; Classification
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Imbalanced data distribution remains an unsolved problem in data mining and machine learning. This paper reviews the class-imbalance problem in classification learning and extends it to clustering, since data clustering is an important and frequently used unsupervised learning method. Two verification methods, based on two different aspects of the original data, are proposed to test the influence of class-imbalanced data on clustering. We also run experiments at different imbalance ratios to explore the ratio's importance for clustering algorithms, since it is a very important factor for performance in classification learning. Experimental results indicate that class imbalance in the dataset can seriously degrade the final performance and efficiency of the clustering algorithm, and that the higher the ratio, the stronger the adverse effect on clustering performance.
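The kind of experiment the abstract describes can be sketched as follows. This is a minimal illustration and not the authors' procedure: the abstract names neither the clustering algorithm nor the two verification methods, so the use of k-means, the 1-D Gaussian toy data, and every function name below are assumptions made for the example. The imbalance ratio is taken here as the majority-class size divided by the minority-class size.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class size divided by minority-class size (assumed definition)."""
    counts = Counter(labels).values()
    return max(counts) / min(counts)


def make_data(n_major, n_minor, rng, gap=4.0):
    """1-D toy data: majority class around 0, minority class around `gap`."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_major)]
    xs += [rng.gauss(gap, 1.0) for _ in range(n_minor)]
    ys = [0] * n_major + [1] * n_minor
    return xs, ys


def two_means(xs, iters=50):
    """Plain Lloyd's k-means with k=2 on scalars, seeded at the min and max."""
    centers = [min(xs), max(xs)]
    assign = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        assign = [0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1 for x in xs]
        # Update step: each center moves to the mean of its members.
        new_centers = list(centers)
        for k in (0, 1):
            members = [x for x, a in zip(xs, assign) if a == k]
            if members:
                new_centers[k] = sum(members) / len(members)
        if new_centers == centers:
            break  # converged: assignments match the current centers
        centers = new_centers
    return assign


def clustering_accuracy(assign, ys):
    """Agreement with the true classes under the better of the two label mappings."""
    agree = sum(a == y for a, y in zip(assign, ys)) / len(ys)
    return max(agree, 1.0 - agree)


def run_experiment(ratios=(1, 5, 20, 50), n_minor=40, seed=0):
    """Cluster toy data at several imbalance ratios; return {ratio: accuracy}."""
    rng = random.Random(seed)
    results = {}
    for ratio in ratios:
        xs, ys = make_data(n_minor * ratio, n_minor, rng)
        results[ratio] = clustering_accuracy(two_means(xs), ys)
    return results


if __name__ == "__main__":
    for ratio, acc in run_experiment().items():
        print(f"imbalance ratio {ratio}:1 -> accuracy {acc:.3f}")
```

Running the script prints one accuracy figure per ratio, which can be compared across ratios. Accuracy against the true classes is only one possible score; the paper's own metrics may differ.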
Pages: 89-93
Page count: 5