An Undersampling Method Approaching the Ideal Classification Boundary for Imbalance Problems

Cited by: 2
Authors
Zhou, Wensheng [1, 2]
Liu, Chen [1, 2]
Yuan, Peng [3]
Jiang, Lei [3]
Affiliations
[1] Natl Key Lab Offshore Oil & Gas Exploitat, Beijing 100028, Peoples R China
[2] CNOOC Res Inst Ltd, Beijing 100028, Peoples R China
[3] Hunan Univ Sci & Technol, Sch Comp Sci & Engn, Xiangtan 411201, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 13
Keywords
classification; cluster-based undersampling; imbalanced problem; optimal number of classifiers
DOI
10.3390/app14135421
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Data imbalance is common in practical machine learning classification tasks and, if not handled properly, can bias classification results towards the majority class. Undersampling in the borderline area is an effective remedy; however, it is difficult to find the area that fits the classification boundary. In this paper, we present a novel undersampling framework that first clusters the majority-class samples and then segments the boundary area according to the resulting clusters. Random sampling within the borderline area of these segments yields a sample whose shape better fits the classification boundary. In addition, we hypothesize that an optimal number of classifiers exists for an ensemble learning method that combines multiple classifiers obtained via sampling to improve the algorithm. After the hypothesis passes testing, we apply the improved algorithm to the newly developed method. The experimental results show that the proposed method works well.
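The abstract outlines the sampling idea only in prose, so the following is a minimal sketch of one possible reading and not the authors' algorithm. It clusters the majority class, treats the members of each cluster that lie closest to any minority sample as that cluster's borderline segment, and samples at random from those segments. The function names (`kmeans`, `borderline_undersample`), the parameters `k` and `border_frac`, the nearest-minority-distance criterion, and the size-proportional quota are all assumptions introduced for illustration.

```python
import math
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means on tuples; returns a cluster index per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest current center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members (unchanged if empty).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def borderline_undersample(majority, minority, k=3, border_frac=0.5, seed=0):
    """Hypothetical cluster-then-sample undersampling of the majority class."""
    rng = random.Random(seed)
    labels = kmeans(majority, k, rng)
    target = len(minority)  # aim for roughly as many samples as the minority
    selected = []
    for c in range(k):
        members = [p for p, lab in zip(majority, labels) if lab == c]
        if not members:
            continue
        # Assumed borderline criterion: distance to the nearest minority sample.
        members.sort(key=lambda p: min(math.dist(p, q) for q in minority))
        border = members[:max(1, math.ceil(border_frac * len(members)))]
        # Assumed quota: proportional to the cluster's share of the majority.
        quota = max(1, round(target * len(members) / len(majority)))
        selected.extend(rng.sample(border, min(quota, len(border))))
    return selected


# Synthetic two-dimensional data, for demonstration only.
rng = random.Random(42)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
minority = [(rng.gauss(2.5, 0.5), rng.gauss(2.5, 0.5)) for _ in range(20)]
kept = borderline_undersample(majority, minority)
print(len(kept), "of", len(majority), "majority samples kept")
```

Repeating the call with different `seed` values would give several undersampled training sets, one per classifier in an ensemble; how many classifiers to combine is the question the abstract's hypothesis addresses, and this sketch does not attempt to answer it.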
Pages: 19
Related papers
50 in total
  • [41] An Oversampling Method for Class Imbalance Problems on Large Datasets
    Rodriguez-Torres, Fredy
    Martinez-Trinidad, Jose F.
    Carrasco-Ochoa, Jesus A.
    APPLIED SCIENCES-BASEL, 2022, 12 (07):
  • [42] A clustering-based adaptive undersampling ensemble method for highly unbalanced data classification
    Yuan, Xiaohan
    Sun, Chuan
    Chen, Shuyu
    APPLIED SOFT COMPUTING, 2024, 159
  • [43] On the homotopy classification of elliptic boundary value problems
    Savin, A
    Schulze, BW
    Sternin, B
    PARTIAL DIFFERENTIAL EQUATIONS AND SPECTRAL THEORY, 2001, 126 : 299 - 305
  • [44] On the homotopy classification of elliptic boundary value problems
    Savin, A.Yu.
    Sternin, B.Yu.
    Doklady Akademii Nauk, 2001, 377 (02) : 165 - 170
  • [45] Classification problems for shifts of modules over a principal ideal domain
    Fagnani, F
    Zampieri, S
    TRANSACTIONS OF THE AMERICAN MATHEMATICAL SOCIETY, 1997, 349 (05) : 1993 - 2006
  • [46] Handling imbalance in hierarchical classification problems using local classifiers approaches
    Pereira, Rodolfo M.
    Costa, Yandre M. G.
    Silla, Carlos N.
    DATA MINING AND KNOWLEDGE DISCOVERY, 2021, 35 (04) : 1564 - 1621
  • [47] Handling imbalance in hierarchical classification problems using local classifiers approaches
    Rodolfo M. Pereira
    Yandre M. G. Costa
    Carlos N. Silla
    Data Mining and Knowledge Discovery, 2021, 35 : 1564 - 1621
  • [48] The Problems of Classification: Method of Committees
    Nikonov, Oleg I.
    Chernavin, Fedor P.
    Medvedeva, Marina A.
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON NUMERICAL ANALYSIS AND APPLIED MATHEMATICS 2015 (ICNAAM-2015), 2016, 1738
  • [49] Ordinary systems in contact with ideal surface. Analysis of approaching of integral equations method
    Tikhonov, D.A.
    Sarkisov, G.N.
    Martynov, G.A.
    Zhurnal Fizicheskoj Khimii, 1992, 66 (01):
  • [50] A boundary element method for multiple moving boundary problems
    Wessex Institute of Technology, Ashurst Lodge, Ashurst, Hampshire SO40 7AA, United Kingdom
    Unknown
    J. Comput. Phys., 2 (501-519):