A Novel Weighted Ensemble Method to Overcome the Impact of Under-fitting and Over-fitting on the Classification Accuracy of the Imbalanced Data Sets

Authors
Fatima, Ghulam [1]
Saeed, Sana [1]
Affiliations
[1] Univ Punjab, Coll Stat & Actuarial Sci, Lahore, Pakistan
Keywords
Imbalanced Data Sets; Under-Fitting; Over-Fitting; Over-Sampling Techniques; Ensemble Method; Weighted Method; SMOTE
DOI
10.18187/pjsor.v17i2.3640
CLC Number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject Classification Codes
020208; 070103; 0714
Abstract
In the data mining community, data sets with imbalanced class distributions have received growing attention. The evolving field of data mining and knowledge discovery seeks to establish precise and effective computational tools for analyzing such data sets and extracting new information from the data. Sampling methods re-balance imbalanced data sets and consequently improve the performance of classifiers. In the classification of imbalanced data sets, over-fitting and under-fitting are the two prominent problems. In this study, a novel weighted ensemble method is proposed to reduce the influence of over-fitting and under-fitting when classifying such data sets. Forty imbalanced data sets with varying imbalance ratios are used in a comparative study. The performance of the proposed method is compared with that of four conventional classifiers: decision tree (DT), k-nearest neighbor (KNN), support vector machine (SVM), and neural network (NN). The evaluation is carried out with two over-sampling procedures, the adaptive synthetic sampling approach (ADASYN) and the synthetic minority over-sampling technique (SMOTE). The proposed method was effective in reducing the impact of over-fitting and under-fitting on the classification of these data sets.
Pages: 483-496
Number of pages: 14