Student Performance Prediction with Decision Tree Ensembles and Feature Selection Techniques

Authors
Ahmad, Amir [1]
Ray, Santosh [2]
Khan, Md. Tabrej [3]
Nawaz, Ali [1]
Affiliations
[1] United Arab Emirates Univ, Coll Informat Technol, Al Ain, U Arab Emirates
[2] Liwa Coll, Fac Informat Technol, Abu Dhabi, U Arab Emirates
[3] Pacific Acad Higher Educ & Res Univ, Fac Comp Sci, Udaipur, Rajasthan, India
Keywords
Student dropout prediction; classification; ensembles; decision trees; imbalanced class; feature selection; projection; SMOTE
DOI
10.1142/S0219649225500169
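Chinese Library Classification number
G25 [Library science, librarianship]; G35 [Information science, information work]
Discipline classification code
1205; 120501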
Abstract
Student dropout is a serious problem in academic settings, affecting both individuals and society as a whole. Accurate prediction of student performance makes it possible to provide at-risk students with timely intervention and support. However, class imbalance and data complexity in educational data pose major challenges for traditional predictive analytics. Our research focusses on using machine learning techniques to predict student performance while handling imbalanced datasets. To address the class imbalance problem, we employed both oversampling and undersampling techniques in our decision tree ensemble methods for the risk classification of prospective students. We evaluated the classifiers by varying the ensemble size and the oversampling and undersampling ratios. Additionally, we conducted experiments that integrate feature selection with the best ensemble classifiers to further improve prediction. Based on extensive experimentation, we concluded that ensemble methods such as Random Forest, Bagging, and Random Undersampling Boosting perform well on measures such as Recall, Precision, F1-score, Area Under the Receiver Operating Characteristic Curve, and Geometric Mean. The best result, an F1-score of 0.849, was produced by the Random Undersampling Boosting classifier combined with the Least Absolute Shrinkage and Selection Operator feature selection method.
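To make the two ideas the abstract relies on concrete, the following is a minimal Python sketch of random undersampling of the majority class and of the imbalance-aware measures it names (Recall, Precision, F1-score, Geometric Mean). It is an illustration under stated assumptions, not the authors' implementation: the function names `random_undersample` and `imbalance_metrics`, the `ratio` parameter (minority-to-majority ratio after sampling), and the definition of Geometric Mean as the square root of recall times specificity are all assumptions introduced here.

```python
import math
import random


def random_undersample(labels, ratio=1.0, minority=1, seed=0):
    """Return sorted indices keeping every minority sample and a random
    subset of the majority class.

    `ratio` is the desired minority:majority ratio after sampling
    (hypothetical parameter; 1.0 means fully balanced classes).
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    # Majority samples to keep; never more than are available.
    keep = min(len(majority_idx), int(round(len(minority_idx) / ratio)))
    return sorted(minority_idx + rng.sample(majority_idx, keep))


def imbalance_metrics(y_true, y_pred, positive=1):
    """Compute recall, precision, F1 and G-mean from binary predictions.

    G-mean is taken here as sqrt(recall * specificity), a common
    definition that is assumed rather than taken from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "g_mean": math.sqrt(recall * specificity),
    }


if __name__ == "__main__":
    # Toy data: 2 at-risk students (label 1) among 10.
    labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    print(random_undersample(labels, ratio=1.0))
    print(imbalance_metrics([1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
                            [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]))
```

In a boosting ensemble of the kind the abstract describes, a resampling step like this would typically be repeated before each base learner is trained. Accuracy alone can look high on imbalanced data even when the minority class is missed, which is why measures such as F1-score and Geometric Mean are reported instead.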
Pages: 21