Early Prediction of University Dropouts - A Random Forest Approach

Cited by: 25
Authors
Behr, Andreas [1 ]
Giese, Marco [1 ]
Teguim, Herve D. K. [1 ]
Theune, Katja [1 ]
Affiliation
[1] Univ Duisburg Essen, Chair Stat, Essen, Germany
Source
JAHRBUCHER FUR NATIONALOKONOMIE UND STATISTIK | 2020 / Vol. 240 / Issue 06
Keywords
student dropout; higher education; dropout prediction; educational data mining; random forest; HIGHER-EDUCATION; ACADEMIC-PERFORMANCE; PANEL ATTRITION; DETERMINANTS; DECISION; COLLEGE; PROBABILITY;
DOI
10.1515/jbnst-2019-0006
Chinese Library Classification
F [Economics];
Discipline classification code
02 ;
Abstract
We predict university dropout using random forests based on conditional inference trees and on a broad German data set covering a wide range of aspects of student life and study courses. We model the dropout decision as a binary classification (graduate or dropout) and focus on very early prediction of student dropout by stepwise modeling students' transition from school (pre-study) over the study-decision phase (decision phase) to the first semesters at university (early study phase). We evaluate how predictive performance changes over the three models, and observe a substantially increased performance when including variables from the first study experiences, resulting in an AUC (area under the curve) of 0.86. Important predictors are the final grade at secondary school, and also determinants associated with student satisfaction and their subjective academic self-concept and self-assessment. A direct outcome of this research is the provision of information to universities wishing to implement early warning systems and more personalized counseling services to support students at risk of dropping out during an early stage of study.
Pages: 743-789
Number of pages: 47
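
The abstract compares three stepwise models (pre-study, decision phase, early study phase) by AUC. As a rough illustration of what that metric measures for a binary graduate/dropout outcome, the sketch below computes AUC directly from its rank interpretation. It does not reproduce the paper's random forests of conditional inference trees. The function name `auc` and all labels and scores are hypothetical toy values invented for this example, not results from the study.

```python
from itertools import product


def auc(labels, scores):
    """Area under the ROC curve via its rank interpretation.

    Returns the probability that a randomly chosen dropout (label 1)
    gets a higher predicted score than a randomly chosen graduate
    (label 0); tied scores count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one dropout and one graduate")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = dropout, 0 = graduate.
labels = [1, 1, 1, 0, 0, 0, 0, 0]

# Hypothetical predicted dropout scores from three stepwise models
# (invented numbers, only to show how the comparison would be run).
stage_scores = {
    "pre-study": [0.6, 0.4, 0.3, 0.5, 0.4, 0.2, 0.3, 0.1],
    "decision phase": [0.7, 0.5, 0.3, 0.5, 0.3, 0.2, 0.2, 0.1],
    "early study phase": [0.9, 0.7, 0.6, 0.5, 0.3, 0.2, 0.2, 0.1],
}

for stage, scores in stage_scores.items():
    print(f"{stage}: AUC = {auc(labels, scores):.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of dropouts from graduates, which gives context for the reported value of 0.86. The pairwise loop is quadratic in sample size, so a rank-based or library implementation would be preferable for data sets of realistic size.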