A novel oversampling technique based on the manifold distance for class imbalance learning

Cited by: 2
Authors
Guo, Yinan [1]
Jiao, Botao [1]
Yang, Lingkai [1]
Cheng, Jian [2]
Yang, Shengxiang [3]
Tang, Fengzhen [4]
Affiliations
[1] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou, Jiangsu, Peoples R China
[2] China Coal Res Inst, Beijing 100013, Peoples R China
[3] De Montfort Univ, Leicester LE1 9BH, Leics, England
[4] Shenyang Inst Automat, Shenyang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
class imbalance learning; oversampling; manifold learning; overlapping; small disjunction; optimization; ensemble
DOI
10.1504/IJBIC.2021.119197
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Oversampling is a popular approach to class imbalance learning: it generates additional minority samples to balance the sizes of the classes in a dataset. However, resampling in the original space is ineffective for imbalanced datasets with class overlapping or small disjunction. To address this, a novel oversampling technique based on manifold distance is proposed, in which each new minority sample is produced according to the distances among neighbours in manifold space rather than the Euclidean distances among them. After the original data are mapped to their manifold structure, overlapping majority and minority samples lie in regions that are easier to separate. In addition, new samples are generated from neighbours that are close in manifold space, which avoids the adverse effect of disjoint minority classes. An adaptive adjustment method is then presented to determine the number of newly generated minority samples according to the distribution density of the matched-pair data. Experimental results on 48 imbalanced datasets indicate that the proposed oversampling technique achieves better classification accuracy.
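The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, under several assumptions that are not taken from the paper: the manifold distance is approximated by shortest-path (geodesic) length on a k-nearest-neighbour graph, in the style of Isomap; new samples are linear interpolations, in the original space, between a minority sample and one of its geodesically nearest minority neighbours; and the number of new samples is fixed by the caller, so the adaptive density-based adjustment is not modelled. The names `knn_graph`, `geodesic_distances`, and `manifold_oversample` are hypothetical.

```python
import heapq
import math
import random


def knn_graph(points, k):
    """Undirected graph linking each point to its k nearest Euclidean neighbours."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in nearest[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest-path lengths from source; unreachable nodes stay at inf."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def manifold_oversample(points, labels, minority, n_new, k=3, seed=0):
    """Generate up to n_new synthetic minority points (assumed scheme, see lead-in)."""
    rng = random.Random(seed)
    graph = knn_graph(points, k)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(minority_idx)
        dist = geodesic_distances(graph, i)
        # Rank the other minority samples by graph distance, not Euclidean distance,
        # and ignore those that cannot be reached along the graph.
        candidates = sorted(
            (dist[j], j)
            for j in minority_idx
            if j != i and math.isfinite(dist[j])
        )
        if not candidates:
            continue  # isolated minority sample: nothing to interpolate with
        _, j = rng.choice(candidates[:k])
        t = rng.random()
        synthetic.append(
            tuple(a + t * (b - a) for a, b in zip(points[i], points[j]))
        )
    return synthetic


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5)]
    lbl = [0, 0, 0, 0, 1, 1, 1]
    for sample in manifold_oversample(pts, lbl, minority=1, n_new=4, k=2):
        print(sample)
```

Because neighbours are ranked by path length along the graph, two minority samples that are close in Euclidean terms but lie in separate, unconnected regions are never paired. This is one plausible reading of how a manifold distance could limit the effect of disjoint minority regions.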
Pages: 131-142
Page count: 12
Related papers
50 in total
  • [1] Manifold Distance-Based Over-Sampling Technique for Class Imbalance Learning
    Yang, Lingkai
    Guo, Yinan
    Cheng, Jian
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 10071 - 10072
  • [2] A Boosting based Adaptive Oversampling Technique for Treatment of Class Imbalance
    Devi, Debashree
    Biswas, Saroj K.
    Purkayastha, Biswajit
    2019 INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATION AND INFORMATICS (ICCCI - 2019), 2019,
  • [3] Extreme Anomalous Oversampling Technique for Class Imbalance
    Chiamanusorn, Chittima
    Sinapiromsaran, Krung
    PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY (ICIT 2017), 2017, : 341 - 345
  • [4] CARBO: Clustering and rotation based oversampling for class imbalance learning
    Paul, Mahit Kumar
    Pal, Biprodip
    Sattar, A. H. M. Sarowar
    Siddique, A. S. M. Mustakim Rahman
    Hasan, Md. Al Mehedi
    KNOWLEDGE-BASED SYSTEMS, 2024, 300
  • [5] Stop Oversampling for Class Imbalance Learning: A Review
    Tarawneh, Ahmad S.
    Hassanat, Ahmad B.
    Altarawneh, Ghada Awad
    Almuhaimeed, Abdullah
    IEEE ACCESS, 2022, 10 : 47643 - 47660
  • [6] Correlation-based Oversampling aided Cost Sensitive Ensemble learning technique for Treatment of Class Imbalance
    Devi, Debashree
    Biswas, Saroj K.
    Purkayastha, Biswajit
    JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE, 2022, 34 (01) : 143 - 174
  • [7] Hellinger Distance Based Oversampling Method to Solve Multi-class Imbalance Problem
    Kumari, Amisha
    Thakar, Urjita
    2017 7TH INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS AND NETWORK TECHNOLOGIES (CSNT), 2017, : 137 - 141
  • [8] A novel oversampling technique for class-imbalanced learning based on SMOTE and natural neighbors
    Li, Junnan
    Zhu, Qingsheng
    Wu, Quanwang
    Fan, Zhu
    INFORMATION SCIENCES, 2021, 565 : 438 - 455
  • [9] Clustering-Based Oversampling Algorithm for Multi-class Imbalance Learning
    Zhao, Haixia
    Wu, Jian
    JOURNAL OF CLASSIFICATION, 2025, 42 (01) : 205 - 220
  • [10] Support Vector based Oversampling Technique for Handling Class Imbalance in Software Defect Prediction
    Malhotra, Ruchika
    Agrawal, Vaibhav
    Pal, Vedansh
    Agarwal, Tushar
    2021 11TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, DATA SCIENCE & ENGINEERING (CONFLUENCE 2021), 2021, : 1078 - 1083