The Role of Machine Learning in Identifying Students At-Risk and Minimizing Failure

Cited by: 3
Authors
Pek, Reyhan Zeynep [1]
Ozyer, Sibel Tariyan [2]
Elhage, Tarek [3]
Ozyer, Tansel [2]
Alhajj, Reda [1, 4, 5]
Affiliations
[1] Istanbul Medipol Univ, Dept Comp Engn, TR-34810 Istanbul, Turkiye
[2] Ankara Medipol Univ, Dept Comp Engn, TR-06050 Ankara, Turkiye
[3] ABC Private Sch, Abu Dhabi, U Arab Emirates
[4] Univ Calgary, Dept Comp Sci, Calgary, AB T2N 1N4, Canada
[5] Univ Southern Denmark, Dept Health Informat, DK-5230 Odense, Denmark
Keywords
Predictive models; Data models; Machine learning; Stacking; Machine learning algorithms; Prediction algorithms; Data mining; At-risk students; classification; dropout prediction; hybrid model; machine learning techniques; stacking ensemble model; student performance prediction; ACADEMIC-PERFORMANCE; DROPOUT PREDICTION; SCHOOL;
DOI
10.1109/ACCESS.2022.3232984
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Education is very important for students' future success. Instructors can support low-performing students with extra assignments and projects, but a major problem is that at-risk students often cannot be identified early. Various researchers are investigating this problem with machine learning techniques, which are used in many areas and have begun to be applied to identifying at-risk students early so that instructors can provide support. This paper discusses the performance results obtained with machine learning algorithms for identifying at-risk students and minimizing student failure. The main purpose of the project is to create a hybrid model using the ensemble stacking method and to predict at-risk students with this model. We used the machine learning algorithms Naive Bayes, Random Forest, Decision Tree, K-Nearest Neighbors, Support Vector Machine, AdaBoost Classifier and Logistic Regression. The performance of each algorithm was measured with various metrics, and the hybrid model presented in this study combines the algorithms that gave the best prediction results. A data set containing the demographic and academic information of the students was used to train and test the model. In addition, a web application developed for the effective use of the hybrid model and for obtaining prediction results is presented. In the proposed method, stratified k-fold cross validation and hyperparameter optimization were found to increase the performance of the models. The hybrid ensemble model was tested with two different combinations of the data to understand the importance of the data features. In the first combination, using both demographic and academic data, the accuracy of the hybrid model was 94.8%. In the second combination, using only academic data, the accuracy of the hybrid model increased to 98.4%. This study focuses on predicting the performance of at-risk students early, so that teachers will be able to provide extra assistance to students with low performance.
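The abstract credits stratified k-fold cross validation with improving model performance. The paper's own implementation is not shown in this record, so the following is only a minimal sketch of the general idea, using the Python standard library: each fold keeps roughly the same share of at-risk and not-at-risk students as the full data set. The function name `stratified_k_fold`, the label encoding, and the toy class sizes are assumptions made for illustration, not details from the paper.

```python
from collections import defaultdict
import random


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs whose test folds
    preserve the class proportions of `labels` as closely as possible."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin into the folds
    # so every fold receives a near-equal share of each class.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train, test


# Hypothetical, imbalanced labels: 1 = at-risk, 0 = not at-risk.
labels = [1] * 10 + [0] * 40
splits = list(stratified_k_fold(labels, k=5))
at_risk_per_fold = [sum(labels[i] for i in test) for _, test in splits]
```

With these toy labels every test fold holds the same number of at-risk students, which is the property that matters when the at-risk class is small: a plain random split could leave some folds with very few positive examples. In practice, libraries such as scikit-learn provide comparable splitters as well as stacking ensembles, and those would normally be preferred over a hand-written version.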
Pages: 1224-1243
Page count: 20
Related papers
50 items in total
  • [1] Identifying Students At-Risk with an Ensemble of Machine Learning Algorithms
    Soobramoney, Ranjin
    Singh, Alveen
    [J]. 2019 CONFERENCE ON INFORMATION COMMUNICATIONS TECHNOLOGY AND SOCIETY (ICTAS), 2019,
  • [2] Identifying At-Risk Students for Early Intervention-A Probabilistic Machine Learning Approach
    Nimy, Eli
    Mosia, Moeketsi
    Chibaya, Colin
    [J]. APPLIED SCIENCES-BASEL, 2023, 13 (06):
  • [3] Early Detection At-Risk Students using Machine Learning
    Pongpaichet, Siripen
    Jankapor, Sawarin
    Janchai, Sarun
    Tongsanit, Todsaporn
    [J]. 11TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE: DATA, NETWORK, AND AI IN THE AGE OF UNTACT (ICTC 2020), 2020, : 283 - 287
  • [4] The effectiveness of learning analytics for identifying at-risk students in higher education
    Foster, Ed
    Siddle, Rebecca
    [J]. ASSESSMENT & EVALUATION IN HIGHER EDUCATION, 2020, 45 (06) : 842 - 854
  • [5] A Scalable Machine Learning-based Ensemble Approach to Enhance the Prediction Accuracy for Identifying Students at-Risk
    Verma, Swati
    Yadav, Rakesh Kumar
    Kholiya, Kuldeep
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (08) : 185 - 192
  • [6] Identifying At-Risk Students in Online Learning by Analysing Learning Behaviour: A Systematic Review
    Na, Kew Si
    Tasir, Zaidatun
    [J]. 2017 IEEE CONFERENCE ON BIG DATA AND ANALYTICS (ICBDA), 2017, : 118 - 123
  • [7] Optimized Screening for At-Risk Students in Mathematics: A Machine Learning Approach
    Bulut, Okan
    Cormier, Damien C.
    Yildirim-Erbasli, Seyma Nur
    [J]. INFORMATION, 2022, 13 (08)
  • [8] Identifying at-risk students in higher education
    Duarte, Rogerio
    Ramos-Pires, Antonio
    Goncalves, Helena
    [J]. TOTAL QUALITY MANAGEMENT & BUSINESS EXCELLENCE, 2014, 25 (7-8) : 944 - 952
  • [9] An Interpretable Pipeline for Identifying At-Risk Students
    Pei, Bo
    Xing, Wanli
    [J]. JOURNAL OF EDUCATIONAL COMPUTING RESEARCH, 2022, 60 (02) : 380 - 405
  • [10] Using machine learning to identify the most at-risk students in physics classes
    Yang, Jie
    DeVore, Seth
    Hewagallage, Dona
    Miller, Paul
    Ryan, Qing X.
    Stewart, John
    [J]. PHYSICAL REVIEW PHYSICS EDUCATION RESEARCH, 2020, 16 (02):