Comparing Different Oversampling Methods in Predicting Multi-Class Educational Datasets Using Machine Learning Techniques

Authors
Tariq, Muhammad Arham [1]
Sargano, Allah Bux [2]
Iftikhar, Muhammad Aksam [2]
Habib, Zulfiqar [2]
Affiliations
[1] Univ Cent Punjab, Dept Comp Sci, Lahore, Pakistan
[2] COMSATS Univ Islamabad, Dept Comp Sci, Islamabad, Pakistan
Keywords
Imbalance educational datasets; Students' academic performance; Educational data mining; Data re-sampling; SMOTE;
DOI
10.2478/cait-2023-0044
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Predicting students' academic performance is a critical research area, yet imbalanced educational datasets, in which academic levels are unequally represented, present challenges for classifiers. While prior research has addressed imbalance in binary-class datasets, this study focuses on multi-class datasets. Ten resampling methods (SMOTE, ADASYN, Distance SMOTE, BorderLineSMOTE, KmeansSMOTE, SVMSMOTE, LN SMOTE, MWSMOTE, Safe Level SMOTE, and SMOTETomek) are compared alongside nine classification models: K-Nearest Neighbors (KNN), Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), Support Vector Machine (SVM), Logistic Regression (LR), Extra Tree (ET), Random Forest (RF), Extreme Gradient Boosting (XGB), and AdaBoost (AdaB). Following a rigorous evaluation, including hyperparameter tuning and 10-fold cross-validation, KNN with SMOTETomek attains the highest accuracy of 83.7%, as demonstrated through an ablation study. These results emphasize SMOTETomek's effectiveness in mitigating class imbalance in educational datasets and highlight KNN's potential as an educational data mining classifier.
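The abstract names several SMOTE-family resamplers but gives no implementation details. The block below is a minimal sketch of the basic SMOTE interpolation step applied to a multi-class dataset. It is not the authors' pipeline, which also involves Tomek-link cleaning, hyperparameter tuning, and 10-fold cross-validation. The function name `smote_multiclass`, the parameter `k`, and the toy data are assumptions made for illustration; only the Python standard library is used.

```python
import math
import random
from collections import Counter


def smote_multiclass(X, y, k=3, seed=0):
    """Oversample every non-majority class up to the majority class size.

    Each synthetic point lies on the segment between a real sample and one
    of its k nearest neighbours from the same class (the basic SMOTE idea).
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = [list(row) for row in X], list(y)

    for label, n in counts.items():
        members = [row for row, lab in zip(X, y) if lab == label]
        if n == target or len(members) < 2:
            continue  # majority class, or too few points to interpolate
        for _ in range(target - n):
            base = rng.choice(members)
            # k nearest same-class neighbours of the chosen sample
            others = [m for m in members if m is not base]
            neighbours = sorted(others, key=lambda m: math.dist(base, m))[:k]
            mate = rng.choice(neighbours)
            gap = rng.random()  # position along the segment, in [0, 1)
            X_out.append([b + gap * (m - b) for b, m in zip(base, mate)])
            y_out.append(label)
    return X_out, y_out


# Toy three-class dataset with class sizes 6 / 3 / 2 (made up for illustration).
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.4], [0.4, 0.1],
     [2.0, 2.0], [2.2, 2.1], [2.1, 2.3],
     [5.0, 0.0], [5.2, 0.3]]
y = ["high"] * 6 + ["medium"] * 3 + ["low"] * 2

X_res, y_res = smote_multiclass(X, y)
print(Counter(y_res))
```

After resampling, each of the three classes has six samples, and every synthetic point stays inside the region spanned by real samples of its own class. A combined method such as SMOTETomek would additionally remove Tomek links (pairs of mutual nearest neighbours with different labels) after this step.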
Pages: 199-212
Number of pages: 14