Using Large Language Models to Generate Educational Materials on Childhood Glaucoma

Cited by: 3
Authors
Dihan, Qais [1,2]
Chauhan, Muhammad Z. [2]
Eleiwa, Taher K. [3]
Hassan, Amr K. [4]
Sallam, Ahmed B. [2,5]
Khouri, Albert S. [6]
Chang, Ta C. [7]
Elhusseiny, Abdelrahman M. [2,8]
Affiliations
[1] Chicago Med Sch, Dept Med, N Chicago, IL USA
[2] Univ Arkansas Med Sci, Harvey & Bernice Jones Eye Inst, Dept Ophthalmol, Little Rock, AR USA
[3] Univ Arkansas Med Sci, Harvey & Bernice Jones Eye Inst, Benha, AR USA
[4] South Valley Univ, Fac Med, Dept Ophthalmol, Qena, Egypt
[5] Ain Shams Univ, Fac Med, Dept Ophthalmol, Cairo, Egypt
[6] Rutgers New Jersey Med Sch, Inst Ophthalmol & Visual Sci ASK, Newark, NJ USA
[7] Univ Miami, Bascom Palmer Eye Inst, Dept Ophthalmol, Miller Sch Med, Miami, FL USA
[8] Harvard Med Sch, Boston Childrens Hosp, Dept Ophthalmol, Boston, MA USA
Keywords
FOLLOW-UP; READABILITY; INFORMATION; ADHERENCE; BARRIERS; QUALITY; CARE
DOI
10.1016/j.ajo.2024.04.004
Chinese Library Classification: R77 [Ophthalmology]
Discipline code: 100212
Abstract
Purpose: To evaluate the quality, readability, and accuracy of large language model (LLM)-generated patient education materials (PEMs) on childhood glaucoma, and the models' ability to improve the readability of existing online information.
Design: Cross-sectional comparative study.
Methods: We evaluated responses of ChatGPT-3.5, ChatGPT-4, and Bard to 3 separate prompts requesting that they write PEMs on "childhood glaucoma." Prompt A required that PEMs be "easily understandable by the average American." Prompt B required that PEMs be written "at a 6th-grade level using Simple Measure of Gobbledygook (SMOG) readability formula." We then compared responses' quality (DISCERN questionnaire, Patient Education Materials Assessment Tool [PEMAT]), readability (SMOG, Flesch-Kincaid Grade Level [FKGL]), and accuracy (Likert misinformation scale). To assess improvement in the readability of existing online information, Prompt C requested that each LLM rewrite 20 resources from a Google search of the keyword "childhood glaucoma" to the American Medical Association-recommended "6th-grade level." Rewrites were compared on key metrics such as readability, complex words (>= 3 syllables), and sentence count.
Results: All 3 LLMs generated PEMs of high quality, understandability, and accuracy (DISCERN >= 4, >= 70% PEMAT understandability, misinformation score = 1). Prompt B responses were more readable than Prompt A responses for all 3 LLMs (P <= .001). ChatGPT-4 generated the most readable PEMs compared to ChatGPT-3.5 and Bard (P <= .001). Although Prompt C responses showed a consistent reduction in mean SMOG and FKGL scores, only ChatGPT-4 achieved the specified 6th-grade reading level (SMOG 4.8 +/- 0.8; FKGL 3.7 +/- 1.9).
Conclusions: LLMs can serve as strong supplemental tools for generating high-quality, accurate, and novel PEMs and for improving the readability of existing PEMs on childhood glaucoma.
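For context on the readability metrics compared in the abstract, the published SMOG and FKGL formulas can be sketched in Python. The vowel-group syllable counter below is a rough assumption for illustration only; validated readability tools use dictionary-based syllable counting, and this sketch is not the instrument the study used.

```python
import re
from math import sqrt

def count_syllables(word: str) -> int:
    # Crude heuristic: count groups of consecutive vowels.
    # An assumption for illustration, not a validated syllable counter.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def _split(text: str):
    # Sentences end at ., !, or ?; words are alphabetic runs.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    return sentences, words

def smog(text: str) -> float:
    # SMOG grade = 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291
    sentences, words = _split(text)
    poly = sum(1 for w in words if count_syllables(w) >= 3)  # "complex words"
    return 1.0430 * sqrt(poly * 30 / len(sentences)) + 3.1291

def fkgl(text: str) -> float:
    # FKGL = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
    sentences, words = _split(text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words) - 15.59)
```

Both formulas penalize long sentences and polysyllabic words, which is why the study tracked complex words (>= 3 syllables) and sentence count when comparing the LLM rewrites.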
Pages: 28-38 (11 pages)
Related Papers (50 total)
  • [21] Assertify: Utilizing Large Language Models to Generate Assertions for Production Code
    Torkamani, Mohammad Jalili
    Sharma, Abhinav
    Mehrotra, Nikita
    Purandare, Rahul
    arXiv
  • [22] Evaluating the Application of Large Language Models to Generate Feedback in Programming Education
    Jacobs, Sven
    Jaschke, Steffen
    2024 IEEE GLOBAL ENGINEERING EDUCATION CONFERENCE, EDUCON 2024, 2024,
  • [23] A Survey of Lay People's Willingness to Generate Legal Advice using Large Language Models (LLMs)
    Seabrooke, Tina
    Schneiders, Eike
    Dowthwaite, Liz
    Krook, Joshua
    Leesakul, Natalie
    Cios, Jeremie
    Maior, Horia
    Fischer, Joel
    PROCEEDINGS OF THE SECOND INTERNATIONAL SYMPOSIUM ON TRUSTWORTHY AUTONOMOUS SYSTEMS, TAS 2024, 2024,
  • [24] Tailoring glaucoma education using large language models: Addressing health disparities in patient comprehension
    Spina, Aidin C.
    Fereydouni, Pirooz
    Tang, Jordan N.
    Andalib, Saman
    Picton, Bryce G.
    Fox, Austin R.
    MEDICINE, 2025, 104 (02)
  • [25] Using large language models to generate silicon samples in consumer and marketing research: Challenges, opportunities, and guidelines
    Sarstedt, Marko
    Adler, Susanne J.
    Rau, Lea
    Schmitt, Bernd
    PSYCHOLOGY & MARKETING, 2024, 41 (06) : 1254 - 1270
  • [26] Using Artificial Intelligence to Generate Medical Literature for Patients: A Comparison of Three Different Large Language Models
    Pompili, D.
    Richa, Y.
    Collins, P.
    Hennessey, D. B.
    BRITISH JOURNAL OF SURGERY, 2024, 111
  • [27] Leveraging Large Language Models to Generate Clinical Histories for Oncologic Imaging Requisitions
    Bhayana, Rajesh
    Alwahbi, Omar
    Ladak, Aly Muhammad
    Deng, Yangqing
    Dias, Adriano Basso
    Elbanna, Khaled
    Gomez, Jorge Abreu
    Jajodia, Ankush
    Jhaveri, Kartik
    Johnson, Sarah
    Kajal, Dilkash
    Wang, David
    Soong, Christine
    Kielar, Ania
    Krishna, Satheesh
    RADIOLOGY, 2025, 314 (02)
  • [28] How Useful Are Educational Questions Generated by Large Language Models?
    Elkins, Sabina
    Kochmar, Ekaterina
    Serban, Iulian
    Cheung, Jackie C. K.
    ARTIFICIAL INTELLIGENCE IN EDUCATION. POSTERS AND LATE BREAKING RESULTS, WORKSHOPS AND TUTORIALS, INDUSTRY AND INNOVATION TRACKS, PRACTITIONERS, DOCTORAL CONSORTIUM AND BLUE SKY, AIED 2023, 2023, 1831 : 536 - 542
  • [29] Large language models generate functional protein sequences across diverse families
    Madani, Ali
    Krause, Ben
    Greene, Eric R.
    Subramanian, Subu
    Mohr, Benjamin P.
    Holton, James M.
    Olmos, Jose Luis
    Xiong, Caiming
    Sun, Zachary Z.
    Socher, Richard
    Fraser, James S.
    Naik, Nikhil
    NATURE BIOTECHNOLOGY, 2023, 41 : 1099 - 1106
  • [30] Instruct Large Language Models to Generate Scientific Literature Survey Step by Step
    Lai, Yuxuan
    Wu, Yupeng
    Wang, Yidan
    Hu, Wenpeng
    Zheng, Chen
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT V, NLPCC 2024, 2025, 15363 : 484 - 496