Enhancing Automated Scoring of Math Self-Explanation Quality Using LLM-Generated Datasets: A Semi-Supervised Approach

Cited by: 4
Authors
Nakamoto, Ryosuke [1]
Flanagan, Brendan [2]
Yamauchi, Taisei [1]
Dai, Yiling [3]
Takami, Kyosuke [3, 4]
Ogata, Hiroaki [3]
Affiliations
[1] Kyoto Univ, Grad Sch Informat, Kyoto 6068501, Japan
[2] Kyoto Univ, Inst Liberal Arts & Sci, Ctr Innovat Res & Educ Data Sci, Kyoto 6068501, Japan
[3] Kyoto Univ, Acad Ctr Comp & Media Studies, Kyoto 6068501, Japan
[4] Natl Inst Educ Policy Res, Educ Data Sci Ctr, Tokyo 1008951, Japan
Keywords
self-explanation; automated scoring; semi-supervised learning; language learning model (LLM); data augmentation; ABSOLUTE ERROR MAE; WORKED-EXAMPLES; RMSE;
DOI
10.3390/computers12110217
Chinese Library Classification number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
In mathematics education, self-explanation is a crucial learning mechanism that allows learners to articulate their comprehension of intricate mathematical concepts and strategies. As digital learning platforms grow in prominence, there are mounting opportunities to collect and utilize mathematical self-explanations. However, these opportunities come with challenges in automated evaluation. Automatic scoring of mathematical self-explanations is crucial for preprocessing tasks, including the categorization of learner responses, identification of common misconceptions, and the creation of tailored feedback and model solutions. Nevertheless, this task is hindered by a scarcity of sample data. Our research introduces a semi-supervised technique using a large language model (LLM), specifically its Japanese variant, to enrich datasets for the automated scoring of mathematical self-explanations. We rigorously evaluated the quality of self-explanations across five datasets, ranging from human-evaluated originals to ones devoid of original content. Our results show that combining LLM-based explanations with mathematical material significantly improves the model's accuracy. Interestingly, there is an optimal limit to how much synthetic self-explanation data can benefit the system. Exceeding this limit does not further improve outcomes. This study thus highlights the need for careful consideration when integrating synthetic data into solutions, especially within the mathematics discipline.
Pages: 18