Predictive Model to Analyze Real and Synthetic Data for Learners' Performance Prediction Using Regression Techniques

Cited by: 0
Authors
Shabnam, Aras S. J. [1]
Ramachandriah, Tanuja [1]
Haladappa, Manjula S. [1]
Affiliations
[1] Bangalore Univ, UVCE, Bangalore, Karnataka, India
Source
ONLINE LEARNING | 2025 / Vol. 29 / No. 01
Keywords
Learners' performance prediction; educational data analytics; predictive models; privacy preservation; synthetic data generation; regression analysis
DOI
10.24059/olj.v29i1.4390
Chinese Library Classification (CLC) number
G40 [Education]
Discipline classification code
040101; 120403
Abstract
Predicting learner performance with precision is critical within educational systems, offering a basis for tailored interventions and instruction. The advent of big data analytics presents an opportunity to employ Machine Learning (ML) techniques to this end. However, the availability of real-world data is often hampered by privacy concerns, prompting a shift towards synthetic data generation. This study presents an empirical comparison of real, synthetic, and hybrid (real + synthetic) datasets in forecasting learner performance, deploying an array of regression-based ML algorithms, including Random Forest, Gradient Boosting, Support Vector Regression, XGBoost, and K-nearest Neighbor. Our methodology encompasses the generation of synthetic data via a generative model, followed by the application of these algorithms to each dataset. The models are evaluated using precision metrics to assess their predictive accuracy. The study reveals that synthetic data can match real data in terms of predictive performance, with hybrid datasets achieving an accuracy of up to 87.76%, highlighting the effectiveness of combining both data types. These findings highlight the potential of synthetic data as an effective alternative when access to actual data is limited, promoting progress in educational technology and ML.
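The comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy learner records (`make_toy_records`), the smoothed bootstrap used as a stand-in for the unnamed generative model, the plain k-nearest-neighbor regressor, and mean absolute error as the metric. It trains the same regressor on real, synthetic, and hybrid training sets and scores each on held-out real records.

```python
import random
import statistics


def make_toy_records(n, rng):
    """Hypothetical learner records: (hours_studied, quiz_avg) -> final score."""
    rows = []
    for _ in range(n):
        hours = rng.uniform(0, 10)
        quiz = rng.uniform(40, 100)
        score = 20 + 3 * hours + 0.5 * quiz + rng.gauss(0, 3)
        rows.append(((hours, quiz), score))
    return rows


def smoothed_bootstrap(rows, n, rng, jitter=0.05):
    """Stand-in generative model (an assumption): resample real rows and add
    Gaussian noise scaled to each column's standard deviation."""
    cols = list(zip(*[x + (y,) for x, y in rows]))
    sds = [statistics.pstdev(c) for c in cols]
    out = []
    for _ in range(n):
        x, y = rng.choice(rows)
        noisy = [v + rng.gauss(0, jitter * sd) for v, sd in zip(x + (y,), sds)]
        out.append((tuple(noisy[:-1]), noisy[-1]))
    return out


def knn_predict(train, x, k=5):
    """Plain k-nearest-neighbor regression with squared Euclidean distance."""
    nearest = sorted(
        train, key=lambda r: sum((a - b) ** 2 for a, b in zip(r[0], x))
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mean_abs_error(train, test, k=5):
    """Average absolute prediction error over the test records."""
    return sum(abs(knn_predict(train, x, k) - y) for x, y in test) / len(test)


def compare(seed=0):
    """Train on real, synthetic, and hybrid data; test on held-out real data."""
    rng = random.Random(seed)
    data = make_toy_records(200, rng)
    real_train, real_test = data[:150], data[150:]
    synthetic = smoothed_bootstrap(real_train, 150, rng)
    datasets = {
        "real": real_train,
        "synthetic": synthetic,
        "hybrid": real_train + synthetic,
    }
    return {name: mean_abs_error(tr, real_test) for name, tr in datasets.items()}


if __name__ == "__main__":
    for name, mae in compare().items():
        print(f"{name:>9}: MAE = {mae:.2f}")
```

Features are left unscaled to keep the sketch short; a fuller comparison would standardize them and repeat the evaluation for each of the regressors listed in the abstract.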
Pages: 24
Related papers
50 in total
  • [31] Customer Churn Prediction Model using Data Mining techniques
    Mitkees, Ibrahim M. M.
    Badr, Sherif M.
    ElSeddawy, Ahmed Ibrahim Bahgat
    2017 13TH INTERNATIONAL COMPUTER ENGINEERING CONFERENCE (ICENCO), 2017, : 262 - 268
  • [32] A Predictive Model for Heart Disease Detection Using Data Mining Techniques
    Premsmith, Jakkrit
    Ketmaneechairat, Hathairat
    JOURNAL OF ADVANCES IN INFORMATION TECHNOLOGY, 2021, 12 (01) : 14 - 20
  • [33] Construction of financial distress prediction models using stepwise regression and data mining techniques
    Hsieh, Yung-Ming (armin@scu.edu.tw), 2016, ICIC Express Letters Office (07):
  • [34] Service Life Prediction of Painted Renderings Using Maintenance Data through Regression Techniques
    Petersen, Andre
    Silva, Ana
    Gonzalez, Marco
    BUILDINGS, 2023, 13 (03)
  • [35] Nonlinear censored regression using synthetic data
    Delecroix, Michel
    Lopez, Olivier
    Patilea, Valentin
    SCANDINAVIAN JOURNAL OF STATISTICS, 2008, 35 (02) : 248 - 265
  • [36] Performance Prediction for Steel Bridges Using SHM Data and Bayesian Dynamic Regression Linear Model: A Novel Approach
    Qu, Guang
    Sun, Limin
    JOURNAL OF BRIDGE ENGINEERING, 2024, 29 (07)
  • [37] Prediction of Academic Performance of Alcoholic Students Using Data Mining Techniques
    Sasikala, T.
    Rajesh, M.
    Sreevidya, B.
    COGNITIVE INFORMATICS AND SOFT COMPUTING, 2020, 1040 : 141 - 148
  • [38] Prediction of students' performance in elective subject using data mining techniques
    Sulaiman, S.
    Shibghatullah, A. S.
    Rahman, N. A.
    PROCEEDINGS OF MECHANICAL ENGINEERING RESEARCH DAY 2017 (MERD), 2017, : 222 - 224
  • [39] Performance comparison and future estimation of time series data using predictive data mining techniques
    Tanwar, Harshita
    Kakkar, Misha
    2017 1ST IEEE INTERNATIONAL CONFERENCE ON DATA MANAGEMENT, ANALYTICS AND INNOVATION (ICDMAI), 2017, : 9 - 12
  • [40] LinChemIn: SynGraph—a data model and a toolkit to analyze and compare synthetic routes
    Marta Pasquini
    Marco Stenta
    Journal of Cheminformatics, 15