Ensemble models based on CNN and LSTM for dropout prediction in MOOC

Cited by: 6
Authors
Talebi, Kowsar [1]
Torabi, Zeinab [1]
Daneshpour, Negin [1]
Affiliations
[1] Shahid Rajaee Teacher Training Univ, Fac Comp Engn, Tehran, Iran
Keywords
Student dropout; Ensemble models; Convolutional neural network; Long short-term memory; Massive open online courses
DOI
10.1016/j.eswa.2023.121187
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Massive Open Online Courses (MOOCs) have gained considerable popularity in recent years. Despite the large number of students enrolled in these courses, a large percentage drop out, so predicting student dropout has become fundamentally important in this area. Predicting dropout early allows course organizers and educators to intervene and provide targeted support to at-risk students. They can offer additional resources, personalized assistance, or interventions tailored to the specific challenges students face, increasing their chances of completing the course. This study first pre-processes the dataset to create a thirty-day correlation matrix for each learner, enabling early dropout prediction by the end of the first week. It then proposes six new models that combine ensemble classification techniques with Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks. CNN is used for automatic feature extraction, while LSTM models the time-series aspect of the data to improve early prediction performance. Because ensemble classifiers can reduce the variance of prediction errors, combining them with neural networks can improve accuracy and F1 score without overfitting. Applying these techniques yields more accurate week-by-week dropout prediction. Experimental results on the KDD Cup 2015 dataset (from XuetangX, a MOOC platform in China, covering 39 courses, 79,186 students, 120,542 enrollments, and 8,157,277 records collected over 30 days) show that all Bagging models improve on the performance of their base models. For one of the proposed models (Bagging LSTM-LSTM), accuracy reached 94% at the end of the fifth week and averaged 91%. Precision and recall averaged 92%, and F1 score reached 98%, a significant improvement over previous research.
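The abstract does not give implementation details, so the following is only a minimal sketch of the general Bagging idea it relies on: train several base models on bootstrap resamples of the training set and combine their predictions by majority vote. The `ThresholdLearner` base model, the toy seven-day activity rows, and all function names are hypothetical stand-ins chosen to keep the example dependency-free; they are not the authors' CNN or LSTM models or their data.

```python
import random
from collections import Counter


def bootstrap_sample(rows, labels, rng):
    """Draw len(rows) examples with replacement (one bootstrap replicate)."""
    idx = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in idx], [labels[i] for i in idx]


class ThresholdLearner:
    """Stand-in base model (NOT a CNN or LSTM): predicts dropout (1) when a
    learner's total activity is at or below a threshold fitted on the sample."""

    def fit(self, rows, labels):
        totals = [sum(r) for r in rows]
        # Candidate thresholds: one below the minimum, then every observed total.
        candidates = [min(totals) - 1] + sorted(set(totals))
        best_acc, best_t = -1.0, candidates[0]
        for t in candidates:
            hits = sum((tot <= t) == bool(y) for tot, y in zip(totals, labels))
            acc = hits / len(labels)
            if acc > best_acc:
                best_acc, best_t = acc, t
        self.threshold = best_t
        return self

    def predict(self, rows):
        return [1 if sum(r) <= self.threshold else 0 for r in rows]


def bagging_fit(rows, labels, n_models=15, seed=0):
    """Fit n_models base learners, each on its own bootstrap replicate."""
    rng = random.Random(seed)
    return [
        ThresholdLearner().fit(*bootstrap_sample(rows, labels, rng))
        for _ in range(n_models)
    ]


def bagging_predict(models, rows):
    """Majority vote over the base learners (odd n_models avoids ties)."""
    votes = zip(*(m.predict(rows) for m in models))
    return [Counter(v).most_common(1)[0][0] for v in votes]


# Toy data (assumption): daily activity counts for the first week, 1 = dropout.
train_rows = [
    [0, 0, 1, 0, 0, 0, 0],
    [5, 4, 6, 3, 5, 4, 6],
    [1, 0, 0, 0, 0, 0, 0],
    [3, 4, 2, 5, 3, 4, 2],
    [0, 1, 0, 0, 1, 0, 0],
    [6, 5, 7, 4, 6, 5, 7],
]
train_labels = [1, 0, 1, 0, 1, 0]

models = bagging_fit(train_rows, train_labels)
print(bagging_predict(models, [[0] * 7, [10] * 7]))  # [1, 0]
```

Averaging votes over resampled training sets is what reduces the variance of the combined prediction; in a setting like the one the abstract describes, the stand-in learner would be replaced by a neural base model.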
Pages: 17
Related papers
50 in total
  • [1] An Ensemble Learning Model for Early Dropout Prediction of MOOC Courses
    Kun Ma
    Jiaxuan Zhang
    Yongwei Shao
    Zhenxiang Chen
    Bo Yang
    [J]. 计算机教育, 2023, (12) : 124 - 139
  • [2] CNN autoencoders and LSTM-based reduced order model for student dropout prediction
    Ke Niu
    Guoqiang Lu
    Xueping Peng
    Yuhang Zhou
    Jingni Zeng
    Ke Zhang
    [J]. Neural Computing and Applications, 2023, 35 : 22341 - 22357
  • [3] CNN autoencoders and LSTM-based reduced order model for student dropout prediction
    Niu, Ke
    Lu, Guoqiang
    Peng, Xueping
    Zhou, Yuhang
    Zeng, Jingni
    Zhang, Ke
    [J]. NEURAL COMPUTING & APPLICATIONS, 2023, 35 (30) : 22341 - 22357
  • [4] CGDC-LSTM: A novel hybrid neural network model for MOOC dropout prediction
    Zhou, Yuhang
    Niu, Ke
    Lv, Haoyi
    Lu, Guoqiang
    Pan, Yijie
    [J]. 2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [5] Landslide displacement prediction based on the ICEEMDAN, ApEn and the CNN-LSTM models
    Li-min Li
    Chao-yang Wang
    Zong-zhou Wen
    Jian Gao
    Meng-fan Xia
    [J]. Journal of Mountain Science, 2023, 20 : 1220 - 1231
  • [6] Landslide displacement prediction based on the ICEEMDAN, ApEn and the CNN-LSTM models
    Li, Li-min
    Wang, Chao-yang
    Wen, Zong-zhou
    Gao, Jian
    Xia, Meng-fan
    [J]. JOURNAL OF MOUNTAIN SCIENCE, 2023, 20 (05) : 1220 - 1231
  • [7] Landslide displacement prediction based on the ICEEMDAN, ApEn and the CNN-LSTM models
    LI Li-min
    WANG Chao-yang
    WEN Zong-zhou
    GAO Jian
    XIA Meng-fan
    [J]. Journal of Mountain Science, 2023, 20 (05) : 1220 - 1231
  • [8] Prediction model of paroxysmal atrial fibrillation based on pattern recognition and ensemble CNN-LSTM
    Yang, Ping
    Wang, Dan
    Kang, Zi-Jian
    Li, Tong
    Fu, Li-Hua
    Yu, Yue-Ren
    [J]. Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2020, 54 (05) : 1039 - 1048
  • [9] MOOC Dropout Prediction Using FWTS-CNN Model Based on Fused Feature Weighting and Time Series
    Zheng, Yafeng
    Gao, Zhanghao
    Wang, Yihang
    Fu, Qian
    [J]. IEEE ACCESS, 2020, 8 : 225324 - 225335
  • [10] Voting and Ensemble Schemes Based on CNN Models for Photo-Based Gender Prediction
    Jhang, Kyoungson
    [J]. JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2020, 16 (04) : 809 - 819