A multi-manifold learning based instance weighting and under-sampling for imbalanced data classification problems

Cited by: 3
Authors
Feizi, Tayyebe [1]
Moattar, Mohammad Hossein [1]
Tabatabaee, Hamid [1]
Affiliations
[1] Islamic Azad Univ, Dept Comp Engn, Mashhad Branch, Mashhad, Iran
Keywords
Imbalanced data; Classification; Under-sampling; Multi-Manifold learning; REDUCTION ALGORITHM; SMOTE;
DOI
10.1186/s40537-023-00832-2
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Under-sampling is a technique for overcoming the class imbalance problem; however, selecting the instances to drop and measuring their informativeness remain important concerns. This paper offers a new perspective by exploiting the structure of the data to decide on the importance of the data points. For this purpose, a multi-manifold learning approach is proposed. Manifolds represent the underlying structures of data and can help extract the latent space of the data distribution. However, there is no evidence that a single manifold can be relied on to extract the local neighborhoods of a dataset. Therefore, this paper proposes an ensemble of manifold learning approaches and evaluates each manifold with a heuristic based on information loss. Once the optimality score of each manifold has been computed, the centrality and marginality degrees of the samples are computed on each manifold and weighted by the corresponding score. A gradual elimination approach is then proposed, which tries to balance the classes while avoiding a drop in the F-measure on the validation dataset. The proposed method is evaluated on 22 imbalanced datasets from the KEEL and UCI repositories using different classification measures. The experimental results show that the proposed approach is more effective than comparable and earlier approaches, especially when the imbalance ratio is very high.
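The weighting and gradual elimination steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the scoring formulas, so the per-manifold instance scores and the manifold optimality scores are assumed to be already computed, and the function names `combine_weights` and `gradual_undersample`, the `step` parameter, and the convention that a lower weight means a less informative instance are all hypothetical choices made for illustration. The validation F-measure is supplied by the caller as a callback.

```python
from typing import Callable, List, Sequence


def combine_weights(per_manifold: Sequence[Sequence[float]],
                    manifold_scores: Sequence[float]) -> List[float]:
    """Average each instance's per-manifold score, weighted by manifold score.

    per_manifold[m][i] is the (assumed, precomputed) score of instance i
    on manifold m; manifold_scores[m] is that manifold's optimality score.
    """
    total = sum(manifold_scores)
    n = len(per_manifold[0])
    return [
        sum(s * scores[i] for s, scores in zip(manifold_scores, per_manifold)) / total
        for i in range(n)
    ]


def gradual_undersample(labels: Sequence[int],
                        weights: Sequence[float],
                        majority: int,
                        evaluate_f: Callable[[List[int]], float],
                        step: int = 1) -> List[int]:
    """Drop the lowest-weight majority instances in small batches.

    Stops when the classes are balanced or when removing the next batch
    would lower the F-measure returned by evaluate_f (assumption: the
    callback trains on the kept indices and scores a validation set).
    Returns the indices of the instances that are kept.
    """
    keep = list(range(len(labels)))
    n_minority = sum(1 for y in labels if y != majority)
    best_f = evaluate_f(keep)
    while True:
        # Majority instances still kept, least informative first.
        maj = sorted((i for i in keep if labels[i] == majority),
                     key=lambda i: weights[i])
        excess = len(maj) - n_minority
        if excess <= 0:
            break  # classes are balanced
        drop = set(maj[:min(step, excess)])
        candidate = [i for i in keep if i not in drop]
        f = evaluate_f(candidate)
        if f < best_f:
            break  # removing this batch would hurt the validation F-measure
        keep, best_f = candidate, f
    return keep
```

In practice `evaluate_f` would retrain a classifier on the kept instances and return its F-measure on a held-out validation set; a larger `step` trades fewer retraining rounds for coarser control over where elimination stops.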
Pages: 36
Related papers
50 in total
  • [1] A multi-manifold learning based instance weighting and under-sampling for imbalanced data classification problems
    Tayyebe Feizi
    Mohammad Hossein Moattar
    Hamid Tabatabaee
    [J]. Journal of Big Data, 10
  • [2] An Under-sampling Imbalanced Learning of Data Gravitation Based Classification
    Peng, Lizhi
    Yang, Bo
    Chen, Yuehui
    Zhou, Xiaoqing
    [J]. 2016 12TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (ICNC-FSKD), 2016, : 419 - 425
  • [3] An Active Under-sampling Approach for Imbalanced Data Classification
    Yang, Zeping
    Gao, Daqi
    [J]. 2012 FIFTH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID 2012), VOL 2, 2012, : 270 - 273
  • [4] AN IMBALANCED DATA CLASSIFICATION METHOD BASED ON AUTOMATIC CLUSTERING UNDER-SAMPLING
    Deng, Xiaoheng
    Zhong, Weijian
    Ren, Ju
    Zeng, Detian
    Zhang, Honggang
    [J]. 2016 IEEE 35TH INTERNATIONAL PERFORMANCE COMPUTING AND COMMUNICATIONS CONFERENCE (IPCCC), 2016,
  • [5] A New Hybrid Under-sampling Approach to Imbalanced Classification Problems
    Peng, Chun-Yang
    Park, You-Jin
    [J]. APPLIED ARTIFICIAL INTELLIGENCE, 2022, 36 (01)
  • [6] M2GDL: Multi-manifold guided dictionary learning based oversampling and data validation for highly imbalanced classification problems
    Feizi, Tayyebe
    Moattar, Mohammad Hossein
    Tabatabaee, Hamid
    [J]. INFORMATION SCIENCES, 2024, 682
  • [7] An Imbalanced Multi-Label Data Ensemble Learning Method Based on Safe Under-Sampling
    Sun, Zhong-Bin
    Diao, Yu-Xuan
    Ma, Su-Yang
    [J]. Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2024, 52 (10) : 3392 - 3408
  • [8] Evolutionary under-sampling based bagging ensemble method for imbalanced data classification
    Sun, Bo
    Chen, Haiyan
    Wang, Jiandong
    Xie, Hua
    [J]. FRONTIERS OF COMPUTER SCIENCE, 2018, 12 (02) : 331 - 350
  • [9] Evolutionary under-sampling based bagging ensemble method for imbalanced data classification
    Bo Sun
    Haiyan Chen
    Jiandong Wang
    Hua Xie
    [J]. Frontiers of Computer Science, 2018, 12 : 331 - 350
  • [10] An Under-Sampling Method with Support Vectors in Multi-class Imbalanced Data Classification
    Arafat, Md. Yasir
    Hoque, Sabera
    Xu, Shuxiang
    Farid, Dewan Md.
    [J]. 2019 13TH INTERNATIONAL CONFERENCE ON SOFTWARE, KNOWLEDGE, INFORMATION MANAGEMENT AND APPLICATIONS (SKIMA), 2019,