Large language models: a new frontier in paediatric cataract patient education

Cited by: 3
|
Authors
Dihan, Qais [1 ,2 ]
Chauhan, Muhammad Z. [2 ]
Eleiwa, Taher K. [3 ]
Brown, Andrew D. [4 ]
Hassan, Amr K. [5 ]
Khodeiry, Mohamed M. [6 ]
Elsheikh, Reem H. [2 ]
Oke, Isdin [7 ]
Nihalani, Bharti R. [7 ]
VanderVeen, Deborah K. [7 ]
Sallam, Ahmed B. [2 ]
Elhusseiny, Abdelrahman M. [2 ,7 ]
Affiliations
[1] Rosalind Franklin Univ Med & Sci, Chicago Med Sch, N Chicago, IL USA
[2] Univ Arkansas Med Sci, Dept Ophthalmol, Little Rock, AR 72205 USA
[3] Benha Univ, Dept Ophthalmol, Banha, Egypt
[4] Univ Arkansas Med Sci, Little Rock, AR USA
[5] South Valley Univ, Dept Ophthalmol, Qena, Egypt
[6] Univ Kentucky, Dept Ophthalmol, Lexington, KY USA
[7] Harvard Med Sch, Boston Childrens Hosp, Dept Ophthalmol, Boston, MA 02115 USA
Keywords
Medical Education; Public health; Epidemiology; Child health (paediatrics); CHILDHOOD; READABILITY; INFORMATION; QUALITY; HEALTH;
DOI
10.1136/bjo-2024-325252
CLC Classification
R77 [Ophthalmology];
Subject Classification
100212 ;
Abstract
Background/aims This was a cross-sectional comparative study. We evaluated the ability of three large language models (LLMs) (ChatGPT-3.5, ChatGPT-4, and Google Bard) to generate novel patient education materials (PEMs) and to improve the readability of existing PEMs on paediatric cataract.
Methods We compared the LLMs' responses to three prompts. Prompt A requested they write a handout on paediatric cataract that was 'easily understandable by an average American.' Prompt B modified prompt A and requested the handout be written at a 'sixth-grade reading level, using the Simple Measure of Gobbledygook (SMOG) readability formula.' Prompt C rewrote existing PEMs on paediatric cataract 'to a sixth-grade reading level using the SMOG readability formula'. Responses were compared on quality (DISCERN; 1 (low quality) to 5 (high quality)), understandability and actionability (Patient Education Materials Assessment Tool; >= 70%: understandable, >= 70%: actionable), accuracy (Likert misinformation; 1 (no misinformation) to 5 (high misinformation)) and readability (SMOG and Flesch-Kincaid Grade Level (FKGL); grade level <7: highly readable).
Results All LLM-generated responses were of high quality (median DISCERN >= 4), understandability (>= 70%), and accuracy (Likert=1). No LLM-generated response was actionable (<70%). ChatGPT-3.5 and ChatGPT-4 prompt B responses were more readable than prompt A responses (p<0.001). ChatGPT-4 generated more readable responses (lower SMOG and FKGL scores; 5.59 +/- 0.5 and 4.31 +/- 0.7, respectively) than the other two LLMs (p<0.001) and consistently rewrote existing PEMs to or below the specified sixth-grade reading level (SMOG: 5.14 +/- 0.3).
Conclusion LLMs, particularly ChatGPT-4, proved valuable in generating high-quality, readable, accurate PEMs and in improving the readability of existing materials on paediatric cataract.
Pages: 7