Using Large Language Models for Automated Grading of Student Writing about Science

Citations: 0
|
Authors
Impey, Chris [1 ]
Wenger, Matthew [1 ]
Garuda, Nikhil [1 ]
Golchin, Shahriar [2 ]
Stamer, Sarah [1 ]
Affiliations
[1] Univ Arizona, Dept Astron, Tucson, AZ 85721 USA
[2] Univ Arizona, Dept Comp Sci, Tucson, AZ 85721 USA
Funding
U.S. National Science Foundation;
Keywords
Student writing; Science classes; Online education; Assessment; Machine learning; Large language models; ONLINE; ASTRONOMY; RATER;
DOI
10.1007/s40593-024-00453-7
CLC Number
TP39 [Computer Applications];
Discipline Classification Codes
081203 ; 0835 ;
Abstract
Assessing writing in large classes for formal or informal learners presents a significant challenge. Consequently, most large classes, particularly in science, rely on objective assessment tools such as multiple-choice quizzes, which have a single correct answer. The rapid development of AI has introduced the possibility of using large language models (LLMs) to evaluate student writing. An experiment was conducted using GPT-4 to determine if machine learning methods based on LLMs can match or exceed the reliability of instructor grading in evaluating short writing assignments on topics in astronomy. The audience consisted of adult learners in three massive open online courses (MOOCs) offered through Coursera. One course was on astronomy, the second was on astrobiology, and the third was on the history and philosophy of astronomy. The results should also be applicable to non-science majors in university settings, where the content and modes of evaluation are similar. The data comprised answers from 120 students to 12 questions across the three courses. GPT-4 was provided with total grades, model answers, and rubrics from an instructor for all three courses. In addition to evaluating how reliably the LLM reproduced instructor grades, the LLM was also tasked with generating its own rubrics. Overall, the LLM was more reliable than peer grading, both in aggregate and by individual student, and approximately matched instructor grades for all three online courses. The implication is that LLMs may soon be used for automated, reliable, and scalable grading of student science writing.
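The grading setup described in the abstract — giving an LLM the question, a model answer, an instructor rubric, and the grade scale for each short writing assignment — can be sketched as a prompt-assembly step. This is a minimal illustration, not the authors' actual pipeline; the course data, point values, and wording below are hypothetical, and the assembled prompt would be sent to a chat-completion model such as GPT-4.

```python
def build_grading_prompt(question, model_answer, rubric, student_answer, max_score=3):
    """Assemble a single rubric-based grading prompt for an LLM.

    All inputs are plain strings; the model is asked to return only a
    numeric score so the reply can be parsed and compared with
    instructor grades.
    """
    return (
        "You are grading a short science-writing answer.\n"
        f"Question: {question}\n"
        f"Model answer: {model_answer}\n"
        f"Rubric: {rubric}\n"
        f"Maximum score: {max_score}\n"
        f"Student answer: {student_answer}\n"
        "Return only the numeric score."
    )

# Illustrative (invented) example in the spirit of the astronomy MOOC items.
prompt = build_grading_prompt(
    question="Why do stars appear to twinkle?",
    model_answer="Turbulence in Earth's atmosphere refracts starlight along "
                 "its path, making the star's brightness and position flicker.",
    rubric="1 pt: mentions the atmosphere; 1 pt: mentions refraction or "
           "turbulence; 1 pt: notes the effect is atmospheric, not stellar.",
    student_answer="Because the air above us bends the light as it moves.",
)
# The prompt string would then be submitted to the LLM, and the returned
# text parsed as a score; repeating this per student answer yields the
# machine grades that the study compares against instructor grades.
```

Because the rubric and model answer travel inside every prompt, the same function scales across courses by swapping in each course's materials, which mirrors how the study evaluated all three MOOCs with one method.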
Pages: 35