Predictive Model to Analyze Real and Synthetic Data for Learners' Performance Prediction Using Regression Techniques

Times cited: 0
Authors
Shabnam, Aras S. J. [1]
Ramachandriah, Tanuja [1]
Haladappa, Manjula S. [1]
Affiliation
[1] Bangalore Univ, UVCE, Bangalore, Karnataka, India
Source
ONLINE LEARNING | 2025 / Vol. 29 / Issue 01
Keywords
Learners' performance prediction; educational data analytics; predictive models; privacy preservation; synthetic data generation; regression analysis
DOI
10.24059/olj.v29i1.4390
Chinese Library Classification (CLC) number
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
Predicting learner performance with precision is critical within educational systems, offering a basis for tailored interventions and instruction. The advent of big data analytics presents an opportunity to employ Machine Learning (ML) techniques to this end. However, the availability of real-world data is often hampered by privacy concerns, prompting a shift towards synthetic data generation. This study presents an empirical comparison of real, synthetic, and hybrid (real + synthetic) datasets in forecasting learner performance, deploying an array of regression-based ML algorithms, including Random Forest, Gradient Boosting, Support Vector Regression, XGBoost, and K-nearest Neighbor. Our methodology encompasses the generation of synthetic data via a generative model, followed by the application of these algorithms to each dataset. The models are evaluated using precision metrics to assess their predictive accuracy. The study reveals that synthetic data can match real data in terms of predictive performance, with hybrid datasets achieving an accuracy of up to 87.76%, underscoring the effectiveness of combining both data types. These findings highlight the potential of synthetic data as an effective alternative when access to actual data is limited, promoting progress in educational technology and ML.
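The comparison described in the abstract (train on real, synthetic, and hybrid data, then score each model on held-out real data) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy `(study_hours, score)` records, the bivariate Gaussian used as a stand-in for the unspecified generative model, the hand-written 1-D k-nearest-neighbor regressor, and mean absolute error as the metric. It uses only the Python standard library.

```python
import math
import random
import statistics

def make_real(n, rng):
    """Hypothetical stand-in for real learner records: (study_hours, score)."""
    rows = []
    for _ in range(n):
        hours = rng.uniform(0, 10)
        score = 40 + 5 * hours + rng.gauss(0, 5)  # assumed toy relationship
        rows.append((hours, score))
    return rows

def fit_gaussian(rows):
    """Fit a bivariate Gaussian (assumed generative model, not the paper's)."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows) / (len(rows) - 1)
    return mx, my, sx, sy, cov / (sx * sy)

def sample_gaussian(params, n, rng):
    """Draw synthetic rows: x from its marginal, then y from y given x."""
    mx, my, sx, sy, r = params
    cond_sd = sy * math.sqrt(max(0.0, 1 - r * r))
    rows = []
    for _ in range(n):
        x = rng.gauss(mx, sx)
        y = rng.gauss(my + r * sy / sx * (x - mx), cond_sd)
        rows.append((x, y))
    return rows

def knn_predict(train, x, k=5):
    """Predict the mean target of the k training rows nearest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return statistics.fmean(y for _, y in nearest)

def mean_abs_error(train, test, k=5):
    """Mean absolute error of k-NN predictions on the test rows."""
    return statistics.fmean(abs(knn_predict(train, x, k) - y) for x, y in test)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
real = make_real(300, rng)
real_train, real_test = real[:200], real[200:]
synthetic = sample_gaussian(fit_gaussian(real_train), 200, rng)
hybrid = real_train + synthetic

# Every training set is scored against the same held-out real records.
results = {
    name: mean_abs_error(train, real_test)
    for name, train in [("real", real_train),
                        ("synthetic", synthetic),
                        ("hybrid", hybrid)]
}
for name, err in results.items():
    print(f"{name:9s} MAE = {err:.2f}")
```

Scoring all three models on held-out real records is the design choice that matters here: it measures whether a model trained without (or with only some) real data still generalizes to real learners. The numbers this toy prints say nothing about the paper's reported results.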
Pages: 24
Related papers
50 in total
  • [11] ML-based Performance Prediction of SDN using Simulated Data from Real and Synthetic Networks
    Dietz, Katharina
    Gray, Nicholas
    Seufert, Michael
    Hossfeld, Tobias
    PROCEEDINGS OF THE IEEE/IFIP NETWORK OPERATIONS AND MANAGEMENT SYMPOSIUM 2022, 2022
  • [12] Reporting bias when using real data sets to analyze classification performance
    Yousefi, Mohammadmahdi R.
    Hua, Jianping
    Sima, Chao
    Dougherty, Edward R.
    BIOINFORMATICS, 2010, 26 (01) : 68 - 76
  • [13] GHG Global Emission Prediction of Synthetic N Fertilizers Using Expectile Regression Techniques
    Benghzial, Kaoutar
    Raki, Hind
    Bamansour, Sami
    Elhamdi, Mouad
    Aalaila, Yahya
    Peluffo-Ordonez, Diego H.
    ATMOSPHERE, 2023, 14 (02)
  • [14] Mining real estate listings using ORACLE data warehousing and predictive regression
    Wedyawati, W
    Lu, ML
    PROCEEDINGS OF THE 2004 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IRI-2004), 2004, : 296 - 301
  • [15] Predictive Data Analytics using Logistic Regression for Licensure Examination Performance
    Juanatas, Irish C.
    Juanatas, Roben A.
    PROCEEDINGS OF 2019 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND KNOWLEDGE ECONOMY (ICCIKE' 2019), 2019, : 251 - 255
  • [16] Diabetes prediction model using data mining techniques
    Rastogi R.
    Bansal M.
    Measurement: Sensors, 2023, 25
  • [17] A Predictive Model for Cardiovascular Diseases Using Data Mining Techniques
    Kumar, Avneesh
    Singh, Santosh Kumar
    Sinha, Shruti
    CARDIOMETRY, 2022, (24) : 367 - 372
  • [18] Student Dropout Predictive Model Using Data Mining Techniques
    Amaya, Y.
    Barrientos, E.
    Heredia, D.
    IEEE LATIN AMERICA TRANSACTIONS, 2015, 13 (09) : 3127 - 3134
  • [19] Analysis of Data Mining Techniques for Constructing a Predictive Model for Academic Performance
    Merchan, S. M.
    Duarte, J. A.
    IEEE LATIN AMERICA TRANSACTIONS, 2016, 14 (06) : 2783 - 2788
  • [20] Internationalizing Professional Development: Using Educational Data Mining to Analyze Learners' Performance and Dropouts in a French MOOC
    Chaker, Rawad
    Bachelet, Remi
    INTERNATIONAL REVIEW OF RESEARCH IN OPEN AND DISTRIBUTED LEARNING, 2020, 21 (04) : 199 - +