A novel oversampling technique based on the manifold distance for class imbalance learning

Cited by: 2
Authors
Guo, Yinan [1]
Jiao, Botao [1]
Yang, Lingkai [1]
Cheng, Jian [2]
Yang, Shengxiang [3]
Tang, Fengzhen [4]
Affiliations
[1] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou, Jiangsu, Peoples R China
[2] China Coal Res Inst, Beijing 100013, Peoples R China
[3] De Montfort Univ, Leicester LE1 9BH, Leics, England
[4] Shenyang Inst Automat, Shenyang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
class imbalance learning; oversampling; manifold learning; overlapping; small disjunction; OPTIMIZATION; ENSEMBLE;
DOI
10.1504/IJBIC.2021.119197
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Oversampling is a popular approach to class imbalance learning: it generates additional minority samples to balance the sizes of the classes. However, resampling in the original space is ineffective for imbalanced datasets with class overlap or small disjuncts. To address this, a novel oversampling technique based on manifold distance is proposed, in which each new minority sample is generated according to the distances among neighbours in manifold space rather than the Euclidean distances among them. After the original data are mapped to their manifold structure, overlapping majority and minority samples lie in regions that are easier to separate. In addition, new samples are generated from neighbours that are close in manifold space, which avoids the adverse effect of disjoint minority classes. An adaptive adjustment method is then presented to determine the number of newly generated minority samples according to the distribution density of the matched-pair data. Experimental results on 48 imbalanced datasets indicate that the proposed oversampling technique achieves better classification accuracy.
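The general idea in the abstract (pick neighbours by manifold distance rather than Euclidean distance, then synthesise minority samples from them) can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes that manifold distance is approximated by shortest-path length over a k-nearest-neighbour graph (an Isomap-style choice), that new points are formed by SMOTE-style linear interpolation in the original space, and that the number of new samples is fixed by the caller instead of being set by the adaptive density-based rule the abstract mentions. The function names and the parameters `k_graph`, `k_nn`, and `seed` are hypothetical.

```python
import heapq
import math
import random


def knn_graph(points, k):
    """Undirected graph linking each point to its k nearest Euclidean neighbours."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        nearest = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in nearest[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesic_from(graph, source):
    """Dijkstra shortest-path lengths from source; assumed stand-in for manifold distance."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def manifold_oversample(points, labels, minority, n_new, k_graph=3, k_nn=2, seed=0):
    """Generate n_new synthetic minority points (hypothetical helper, fixed n_new)."""
    rng = random.Random(seed)
    graph = knn_graph(points, k_graph)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]

    # For each minority sample, keep its k_nn closest minority neighbours
    # ranked by graph (geodesic) distance instead of Euclidean distance.
    neighbours = {}
    for i in minority_idx:
        geo = geodesic_from(graph, i)
        ranked = sorted((geo[j], j) for j in minority_idx if j != i and j in geo)
        if ranked:
            neighbours[i] = [j for _, j in ranked[:k_nn]]
    if not neighbours:
        raise ValueError("no minority sample has a reachable minority neighbour")

    seeds = sorted(neighbours)
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(seeds)
        j = rng.choice(neighbours[i])
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(points[i], points[j])))
    return synthetic
```

Calling `manifold_oversample(points, labels, minority=1, n_new=5)` on a list of coordinate tuples and a parallel list of labels returns five synthetic tuples. Because unreachable neighbours are skipped, a minority sample in a separate graph component is never paired with one from another component, which is one simple way to keep disjoint minority groups apart.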
Pages: 131-142
Page count: 12
Related papers
50 in total
  • [41] Handling Class Imbalance Problem using Oversampling Techniques: A Review
    Gosain, Anjana
    Sardana, Saanchi
    2017 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2017, : 79 - 85
  • [42] An Ensemble Learning-Based Undersampling Technique for Handling Class-Imbalance Problem
    Sarkar, Sobhan
    Khatedi, Nikhil
    Pramanik, Anima
    Maiti, J.
    PROCEEDINGS OF ICETIT 2019: EMERGING TRENDS IN INFORMATION TECHNOLOGY, 2020, 605 : 586 - 595
  • [43] Instance hardness and multivariate Gaussian distribution-based oversampling technique for imbalance classification
    Xie, Jie
    Zhu, Mingying
    Hu, Kai
    Zhang, Jinglan
    PATTERN ANALYSIS AND APPLICATIONS, 2023, 26 (02) : 735 - 749
  • [44] A Robust Oversampling Approach for Class Imbalance Problem With Small Disjuncts
    Sun, Yi
    Cai, Lijun
    Liao, Bo
    Zhu, Wen
    Xu, Junlin
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (06) : 5550 - 5562
  • [45] Instance hardness and multivariate Gaussian distribution-based oversampling technique for imbalance classification
    Jie Xie
    Mingying Zhu
    Kai Hu
    Jinglan Zhang
    Pattern Analysis and Applications, 2023, 26 : 735 - 749
  • [46] Distribution Based Ensemble for Class Imbalance Learning
    Mustafa, Ghulam
    Niu, Zhendong
    Yousif, Abdallah
    Tarus, John
    FIFTH INTERNATIONAL CONFERENCE ON THE INNOVATIVE COMPUTING TECHNOLOGY (INTECH 2015), 2015, : 5 - 10
  • [47] A Novel Synthetic Minority Oversampling Technique for Imbalanced Data Set Learning
    Barua, Sukarna
    Islam, Md. Monirul
    Murase, Kazuyuki
    NEURAL INFORMATION PROCESSING, PT II, 2011, 7063 : 735 - +
  • [48] Addressing the Big Data Multi-class Imbalance Problem with Oversampling and Deep Learning Neural Networks
    Gonzalez-Barcenas, V. M.
    Rendon, E.
    Alejo, R.
    Granda-Gutierrez, E. E.
    Valdovinos, R. M.
    PATTERN RECOGNITION AND IMAGE ANALYSIS, PT I, 2020, 11867 : 216 - 224
  • [49] Attention features selection oversampling technique (AFS-O) for rolling bearing fault diagnosis with class imbalance
    Han, Zhongze
    Wang, Haoran
    Shen, Chen
    Song, Xuewei
    Cao, Longchao
    Yu, Lianqing
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2024, 35 (03)
  • [50] A Gaussian-Based WGAN-GP Oversampling Approach for Solving the Class Imbalance Problem
    Zhou, Qian
    Sun, Bo
    INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND COMPUTER SCIENCE, 2024, 34 (02) : 291 - 307