Denying Evolution Resampling: An Improved Method for Feature Selection on Imbalanced Data

Cited by: 1
Authors
Quan, Li [1]
Gong, Tao [1]
Jiang, Kaida [1]
Affiliations
[1] Donghua Univ, Coll Informat Sci & Technol, Shanghai 201620, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
classification algorithms; imbalanced data; similarity measure; evolutionary process
DOI
10.3390/electronics12153212
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Imbalanced data classification is an important problem in computer science. Traditional classification algorithms often lose accuracy when the data distribution is uneven, so measures are needed to improve the balance of the dataset and thereby the classification accuracy of the model. We designed a data resampling method to improve classification accuracy. The method uses a negative selection process to constrain the data evolution process. By combining the CRITIC method with regression coefficients, we establish crossover selection probabilities for elite genes to achieve an evolutionary resampling process. Based on independent weights, the feature analysis improves by 3%. We evaluated the resampled results on publicly available datasets using traditional logistic regression with cross-validation. Compared with other resampling models, the F1 scores of logistic regression under five-fold cross-validation are more stable on the two sampling results of the proposed method. These F1 score results verify the effectiveness of the proposed method.
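The abstract names the CRITIC method as one ingredient of the crossover selection probabilities but does not give formulas. As a rough illustration only, the sketch below computes standard CRITIC feature weights (min-max normalization, then standard deviation times the summed correlation "conflict" with the other features). The function name `critic_weights` and the toy data are hypothetical, and the combination with regression coefficients described in the paper is not reproduced here.

```python
import math


def critic_weights(rows):
    """Standard CRITIC weights for a list of equal-length numeric rows.

    Illustrative sketch only; not the authors' implementation.
    """
    n_features = len(rows[0])
    cols = [[row[j] for row in rows] for j in range(n_features)]

    # Min-max normalize each column to [0, 1]; constant columns become zeros.
    norm = []
    for col in cols:
        lo, hi = min(col), max(col)
        span = hi - lo
        norm.append([(v - lo) / span if span else 0.0 for v in col])

    def mean(xs):
        return sum(xs) / len(xs)

    def sample_std(xs):
        m = mean(xs)
        return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

    def pearson(a, b):
        ma, mb = mean(a), mean(b)
        cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
        da = math.sqrt(sum((x - ma) ** 2 for x in a))
        db = math.sqrt(sum((y - mb) ** 2 for y in b))
        return cov / (da * db) if da and db else 0.0

    # Information content: contrast (std) times conflict (sum of 1 - r).
    info = []
    for j in range(n_features):
        conflict = sum(
            1.0 - pearson(norm[j], norm[k])
            for k in range(n_features)
            if k != j
        )
        info.append(sample_std(norm[j]) * conflict)

    total = sum(info)
    if not total:
        return [1.0 / n_features] * n_features
    return [c / total for c in info]


# Toy data: the first two features are perfectly correlated,
# the third runs mostly against them.
toy = [[1, 2, 10], [2, 4, 5], [3, 6, 7], [4, 8, 1]]
weights = critic_weights(toy)
```

In this toy case the two redundant features receive equal weights and the conflicting third feature receives a larger one, which is the behavior CRITIC is designed to produce.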
Pages: 19
Related papers
50 in total
  • [41] An effective distance based feature selection approach for imbalanced data
    Shahee, Shaukat Ali
    Ananthakumar, Usha
    APPLIED INTELLIGENCE, 2020, 50 (03) : 717 - 745
  • [42] An effective distance based feature selection approach for imbalanced data
    Shaukat Ali Shahee
    Usha Ananthakumar
    Applied Intelligence, 2020, 50 : 717 - 745
  • [43] Feature selection for imbalanced data with deep sparse autoencoders ensemble
    Massi, Michela Carlotta
    Gasperoni, Francesca
    Ieva, Francesca
    Paganoni, Anna Maria
    STATISTICAL ANALYSIS AND DATA MINING, 2022, 15 (03) : 376 - 395
  • [44] Feature selection via minimizing global redundancy for imbalanced data
    Shuhao Huang
    Hongmei Chen
    Tianrui Li
    Hao Chen
    Chuan Luo
    Applied Intelligence, 2022, 52 : 8685 - 8707
  • [45] Resampling Imbalanced Healthcare Data for Predictive Modelling
    Mamilla, Manoj Yadav
    Al-Haddad, Ronak
    Chowdhury, Stiphen
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2025, 16 (02) : 36 - 44
  • [46] GA-Based Feature Selection Method for Imbalanced Data with Application in Radio Signal Recognition
    Limin Du
    Yang Xu
    Jun Liu
    Fangli Ma
    International Journal of Computational Intelligence Systems, 2015, 8 : 39 - 47
  • [47] FEATURE SELECTION AND CLASSIFICATION INTEGRATED METHOD FOR IDENTIFYING CITED TEXT SPANS FOR CITANCES ON IMBALANCED DATA
    Yee, Jen-Yuan
    Tsai, Cheng-Jung
    Hsu, Tien-Yu
    Lin, Jung-Yi
    Cheng, Pei-Cheng
    MALAYSIAN JOURNAL OF COMPUTER SCIENCE, 2021, 34 (04) : 355 - 373
  • [48] A Decoupling and Bidirectional Resampling Method for Multilabel Classification of Imbalanced Data with Label Concurrence
    Zhou, Shuyue
    Li, Xiaobo
    Dong, Yihong
    Xu, Hao
    SCIENTIFIC PROGRAMMING, 2020, 2020
  • [49] Enhancing associative classification on imbalanced data through ontology-based feature extraction and resampling
    Kouhoue, Joel Mba
    Lonlac, Jerry
    Lesage, Alexis
    Doniec, Arnaud
    Lecoeuche, Stephane
    KNOWLEDGE-BASED SYSTEMS, 2025, 309
  • [50] An Approach Based on Resampling and Feature Selection to Improve the Classification of Microarray Data
    Soleymani, Nafiseh
    Moattar, Mohammad Hussein
    2018 6TH IRANIAN JOINT CONGRESS ON FUZZY AND INTELLIGENT SYSTEMS (CFIS), 2018, : 61 - 64