Exploring of clustering algorithm on class-imbalanced data

Cited by: 0
Authors
Li Xuan [1]
Chen Zhigang [1]
Yang Fan [1]
Affiliations
[1] Xiamen Univ, Dept Automat, Xiamen 361005, Fujian, Peoples R China
Keywords
Class-imbalanced Data; Clustering Algorithm; Imbalanced-ratios; CLASSIFICATION
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Imbalanced data distribution remains an unsolved problem in data mining and machine learning. This paper introduces the class-imbalance problem as it arises in classification learning and extends it to clustering, since data clustering is an important and frequently used unsupervised learning method. Two verification methods, based on two different aspects of the original data, are proposed to test and verify the influence of class-imbalanced data on clustering. We also conduct experiments with different imbalance ratios to explore the ratio's importance for clustering algorithms, since it is a very important factor for performance in classification learning. Experimental results indicate that class imbalance in a dataset can seriously degrade the final performance and efficiency of a clustering algorithm, and that the higher the imbalance ratio, the greater the adverse effect on clustering performance.
Pages: 89-93
Number of pages: 5
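
The abstract describes experiments that vary the imbalance ratio and observe its effect on clustering. The sketch below shows one way such an experiment could be set up. It is not the authors' procedure: the abstract does not name the clustering algorithm, the datasets, or the two verification methods, so plain k-means with k = 2, synthetic Gaussian data, and balanced accuracy as the score are all assumptions made here for illustration. Every function name (`make_data`, `kmeans2`, `balanced_accuracy`, `run_experiment`) is hypothetical.

```python
import random


def make_data(n_major, ratio, rng, gap=3.0):
    """Two 2-D Gaussian classes; the minority has about n_major / ratio points."""
    n_minor = max(2, n_major // ratio)
    major = [((rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)), 0) for _ in range(n_major)]
    minor = [((rng.gauss(gap, 1.0), rng.gauss(0.0, 1.0)), 1) for _ in range(n_minor)]
    data = major + minor
    rng.shuffle(data)
    points = [p for p, _ in data]
    labels = [y for _, y in data]
    return points, labels


def kmeans2(points, rng, iters=50):
    """Plain Lloyd's k-means with k = 2 and randomly chosen initial centroids."""
    centers = rng.sample(points, 2)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = [
            min((0, 1), key=lambda c: (p[0] - centers[c][0]) ** 2 + (p[1] - centers[c][1]) ** 2)
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centers = []
        for c in (0, 1):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                new_centers.append((
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                ))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's centroid
        if new_centers == centers:
            break
        centers = new_centers
    return assign


def balanced_accuracy(labels, assign):
    """Mean per-class recall under the better of the two cluster-to-class mappings."""
    best = 0.0
    for flip in (0, 1):
        recalls = []
        for cls in (0, 1):
            idx = [i for i, y in enumerate(labels) if y == cls]
            hits = sum(1 for i in idx if (assign[i] ^ flip) == cls)
            recalls.append(hits / len(idx))
        best = max(best, sum(recalls) / 2)
    return best


def run_experiment(ratios, n_major=400, trials=5, seed=0):
    """Average balanced accuracy of k-means for each majority:minority ratio."""
    rng = random.Random(seed)
    results = {}
    for ratio in ratios:
        scores = []
        for _ in range(trials):
            points, labels = make_data(n_major, ratio, rng)
            scores.append(balanced_accuracy(labels, kmeans2(points, rng)))
        results[ratio] = sum(scores) / len(scores)
    return results


if __name__ == "__main__":
    for ratio, score in run_experiment([1, 2, 5, 10, 20, 50]).items():
        print(f"ratio {ratio}:1 -> balanced accuracy {score:.3f}")
```

Balanced accuracy is used instead of raw accuracy because raw accuracy can stay high when the minority class is absorbed into a majority cluster. The score always lies between 0.5 and 1.0, since the better of the two cluster-to-class mappings is taken. No specific numbers are claimed here; the values depend on the seed, the class separation `gap`, and the sample sizes.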