A novel oversampling technique based on the manifold distance for class imbalance learning

Cited by: 2
Authors
Guo, Yinan [1]
Jiao, Botao [1]
Yang, Lingkai [1]
Cheng, Jian [2]
Yang, Shengxiang [3]
Tang, Fengzhen [4]
Affiliations
[1] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou, Jiangsu, Peoples R China
[2] China Coal Res Inst, Beijing 100013, Peoples R China
[3] De Montfort Univ, Leicester LE1 9BH, Leics, England
[4] Shenyang Inst Automat, Shenyang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
class imbalance learning; oversampling; manifold learning; overlapping; small disjunction; optimization; ensemble
DOI
10.1504/IJBIC.2021.119197
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Oversampling is a popular approach to class imbalance learning: it generates additional minority samples to balance the sizes of the different classes. However, resampling in the original space is ineffective for imbalanced datasets with class overlapping or small disjunction. To address this, a novel oversampling technique based on manifold distance is proposed, in which a new minority sample is produced according to the distances among neighbours in manifold space rather than the Euclidean distances among them. After the original data are mapped to their manifold structure, the overlapped majority and minority samples lie in areas that are easily partitioned. In addition, new samples are generated from neighbours that are located nearby in manifold space, which avoids the adverse effect of disjoint minority classes. Furthermore, an adaptive adjustment method is presented to determine the number of newly generated minority samples according to the distribution density of the matched-pair data. Experimental results on 48 imbalanced datasets indicate that the proposed oversampling technique achieves better classification accuracy.
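The core idea in the abstract, choosing neighbours by manifold distance instead of Euclidean distance, can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes an Isomap-style geodesic distance (shortest paths over a k-nearest-neighbour graph), interpolates in the original space in the manner of SMOTE, and omits the adaptive, density-based choice of how many samples to generate. The function names `knn_graph`, `geodesic_from` and `manifold_oversample`, and the parameters `k`, `n_new` and `seed`, are hypothetical.

```python
import heapq
import math
import random


def knn_graph(points, k):
    """Link every point to its k nearest Euclidean neighbours (symmetrised)."""
    n = len(points)
    graph = [dict() for _ in range(n)]
    for i in range(n):
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        for d, j in nearest:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesic_from(graph, source):
    """Dijkstra shortest-path lengths over the graph (assumed manifold distance)."""
    dist = [math.inf] * len(graph)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def manifold_oversample(points, labels, minority, n_new, k=3, seed=0):
    """Create up to n_new synthetic minority points from geodesic neighbours."""
    rng = random.Random(seed)
    graph = knn_graph(points, k)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(minority_idx)
        dist = geodesic_from(graph, i)
        # Rank the other reachable minority samples by geodesic distance.
        candidates = sorted(
            (dist[j], j) for j in minority_idx if j != i and dist[j] < math.inf
        )
        if not candidates:
            continue  # no reachable minority neighbour for this seed point
        _, j = rng.choice(candidates[:k])
        t = rng.random()
        # Simplification: interpolate along the straight segment in input space.
        synthetic.append(
            tuple(a + t * (b - a) for a, b in zip(points[i], points[j]))
        )
    return synthetic
```

Calling `manifold_oversample(points, labels, minority=1, n_new=2, k=2)` on a list of coordinate tuples and a parallel list of labels returns the synthetic minority points. Minority samples that are close in Euclidean terms but not connected along the neighbourhood graph receive a large or infinite geodesic distance, so they are not paired for interpolation.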
Pages: 131-142
Number of pages: 12