Can large language models write reflectively

Cited by: 0
Authors
Li Y. [1 ]
Sha L. [1 ]
Yan L. [1 ]
Lin J. [1 ]
Raković M. [1 ]
Galbraith K. [2 ]
Lyons K. [3 ]
Gašević D. [1 ]
Chen G. [1 ]
Affiliations
[1] Centre for Learning Analytics, Monash University
[2] Experiential Development and Graduate Education, Faculty of Pharmacy and Pharmaceutical Sciences, Monash University
[3] Centre for Digital Transformation of Health, University of Melbourne
Keywords
ChatGPT; Generative language model; Natural language processing; Reflective writing;
DOI
10.1016/j.caeai.2023.100140
Abstract
Generative Large Language Models (LLMs) demonstrate impressive results in different writing tasks and have already attracted much attention from researchers and practitioners. However, there is limited research investigating the capability of generative LLMs for reflective writing. To this end, in the present study, we extensively reviewed the existing literature and selected 9 representative prompting strategies for ChatGPT, a chatbot based on state-of-the-art generative LLMs, to generate a diverse set of reflective responses, which were combined with student-written reflections. Next, those responses were evaluated by experienced teaching staff following a theory-aligned assessment rubric that was designed to evaluate student-generated reflections in several university-level pharmacy courses. Furthermore, we explored the extent to which Deep Learning classification methods can be utilised to automatically differentiate between reflective responses written by students and reflective responses generated by ChatGPT. To this end, we harnessed BERT, a state-of-the-art Deep Learning classifier, and compared its performance to that of human evaluators and the AI content detector by OpenAI. Following our extensive experimentation, we found that (i) ChatGPT may be capable of generating high-quality reflective responses in writing assignments administered across different pharmacy courses; (ii) the quality of automatically generated reflective responses was higher on all six assessment criteria than the quality of student-written reflections; and (iii) a domain-specific BERT-based classifier could effectively differentiate between student-written and ChatGPT-generated reflections, greatly surpassing (by up to 38% across four accuracy metrics) the classification performed by experienced teaching staff and a general-domain classifier, even in cases where the testing prompts were not known at the time of model training. © 2023 The Authors
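The classification experiment described in the abstract fine-tunes BERT on labelled reflections; that pipeline is not reproduced here. As a minimal, self-contained illustration of the same binary framing (student-written vs. LLM-generated text), the sketch below trains a toy bag-of-words logistic-regression classifier in pure Python. All texts, labels, and hyperparameters are invented for illustration and are not from the paper.

```python
# Toy sketch: binary classification of reflections (student vs. LLM-authored).
# This is NOT the paper's BERT pipeline; it only illustrates the framing with
# a bag-of-words logistic-regression classifier and a hypothetical mini-corpus.
import math
from collections import Counter

# Hypothetical corpus; label 0 = student-written, 1 = LLM-generated.
texts = [
    ("i struggled with the dosing calculation and felt unsure about my competence", 0),
    ("the placement was stressful but i learned to ask the pharmacist for help", 0),
    ("reflecting on this experience i gained valuable insights into patient care", 1),
    ("this experience allowed me to develop a deeper understanding of my practice", 1),
]

# Build a fixed vocabulary and represent each text as word counts.
vocab = sorted({w for t, _ in texts for w in t.split()})

def featurize(text):
    counts = Counter(text.split())
    return [counts.get(w, 0) for w in vocab]

X = [featurize(t) for t, _ in texts]
y = [label for _, label in texts]

# Logistic regression trained with plain stochastic gradient descent.
w = [0.0] * len(vocab)
b = 0.0
lr = 0.5

def predict_prob(x):
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

for _ in range(300):
    for x, label in zip(X, y):
        err = predict_prob(x) - label  # gradient of the log loss w.r.t. z
        for i, xi in enumerate(x):
            w[i] -= lr * err * xi
        b -= lr * err

preds = [int(predict_prob(x) >= 0.5) for x in X]
accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)
```

On this tiny, linearly separable corpus the classifier fits the training data exactly; the paper's finding is that a domain-specific BERT model generalises this separation far better than human raters or a general-domain detector.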