EyeGPT for Patient Inquiries and Medical Education: Development and Validation of an Ophthalmology Large Language Model

Authors
Chen, Xiaolan [1]
Zhao, Ziwei [1]
Zhang, Weiyi [1]
Xu, Pusheng [1]
Wu, Yue [1]
Xu, Mingpu [1]
Gao, Le [2]
Li, Yinwen [3, 4]
Shang, Xianwen [1]
Shi, Danli [1, 5]
He, Mingguang [1, 5, 6]
Affiliations
[1] Hong Kong Polytech Univ, Sch Optometry, Hong Kong, Peoples R China
[2] Sun Yat Sen Univ, Zhongshan Ophthalm Ctr, Guangdong Prov Clin Res Ctr Ocular Dis, State Key Lab Ophthalmol, Guangzhou, Peoples R China
[3] Shanghai Jiao Tong Univ, Shanghai Gen Hosp, Shanghai Peoples Hosp 1, Sch Med, Dept Ophthalmol, Shanghai, Peoples R China
[4] Natl Clin Res Ctr Eye Dis, Shanghai, Peoples R China
[5] Hong Kong Polytech Univ, Res Ctr SHARP Vis RCSV, Hong Kong, Peoples R China
[6] Ctr Eye & Vis Res CEVR, 17W Hong Kong Sci Pk, Hong Kong, Peoples R China
Keywords
large language model; generative pretrained transformer; generative artificial intelligence; ophthalmology; retrieval-augmented generation; medical assistant; EyeGPT; generative AI
DOI
10.2196/60063
Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)]
Abstract
Background: Large language models (LLMs) have the potential to enhance clinical flow and improve medical education, but they encounter challenges related to specialized knowledge in ophthalmology.
Objective: This study aims to enhance ophthalmic knowledge by refining a general LLM into an ophthalmology-specialized assistant for patient inquiries and medical education.
Methods: We transformed Llama2 into an ophthalmology-specialized LLM, termed EyeGPT, through the following 3 strategies: prompt engineering for role-playing, fine-tuning with publicly available data sets filtered for eye-specific terminology (83,919 samples), and retrieval-augmented generation leveraging a medical database and 14 ophthalmology textbooks. The efficacy of the EyeGPT variants was evaluated by 4 board-certified ophthalmologists using 120 diverse-category questions in both simple and complex question-answering scenarios. The performance of the best EyeGPT model was then compared with that of the unassisted human physician group and the EyeGPT+human group. We proposed 4 metrics for assessment: accuracy, understandability, trustworthiness, and empathy. The proportion of hallucinations was also reported.
Results: The best fine-tuned model significantly outperformed the original Llama2 model at providing informed advice (mean 9.30, SD 4.42 vs mean 13.79, SD 5.70; P<.001) and mitigating hallucinations (97/120, 80.8% vs 53/120, 44.2%; P<.001). Incorporating information retrieval from reliable sources, particularly ophthalmology textbooks, further improved the model's responses compared with the best fine-tuned model alone (mean 13.08, SD 5.43 vs mean 15.14, SD 4.64; P=.001) and reduced hallucinations (71/120, 59.2% vs 57/120, 47.4%; P=.02). Subgroup analysis revealed that EyeGPT was robust across common diseases, with consistent performance across different users and domains. Among the variants, the model integrating fine-tuning and book retrieval ranked highest, closely followed by the combination of fine-tuning and the manual database, standalone fine-tuning, and pure role-playing methods. EyeGPT demonstrated competitive capabilities in understandability and empathy when compared with human ophthalmologists. With the assistance of EyeGPT, the performance of the ophthalmologists was notably enhanced.
Conclusions: We introduced EyeGPT by refining a general-domain LLM and conducted a comprehensive comparison and evaluation of different strategies for developing an ophthalmology-specific assistant. Our results highlight EyeGPT's potential to assist ophthalmologists and patients in medical settings.
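As a rough illustration of the 3 strategies named in the Methods (role-play prompting, filtering data for eye-specific terminology, and retrieval-augmented generation), the following Python sketch shows how the pieces could fit together. It is not the authors' implementation: the term list `EYE_TERMS`, the instruction text `ROLE_PROMPT`, the word-overlap ranking, and the toy passages are all assumptions made for this example, since the abstract does not specify them.

```python
import re
from collections import Counter

# Hypothetical term list; the abstract does not give the actual filter vocabulary.
EYE_TERMS = {"eye", "retina", "glaucoma", "cataract", "cornea", "myopia", "ophthalmology"}

# Hypothetical role-play instruction (strategy 1); the real prompt is not in the abstract.
ROLE_PROMPT = "You are an ophthalmologist. Answer the patient's question clearly and kindly."


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def is_eye_related(sample):
    """Strategy 2 (data selection): keep a sample if it mentions any eye term."""
    return any(tok in EYE_TERMS for tok in tokenize(sample))


def retrieve(question, passages, k=1):
    """Strategy 3 (retrieval): rank passages by word overlap with the question.

    Word overlap is a stand-in chosen for simplicity; a real system would
    more likely use an embedding-based or BM25-style retriever.
    """
    q_counts = Counter(tokenize(question))

    def overlap(passage):
        return sum((q_counts & Counter(tokenize(passage))).values())

    return sorted(passages, key=overlap, reverse=True)[:k]


def build_prompt(question, passages):
    """Combine the role instruction, retrieved context, and the question."""
    context = "\n".join(retrieve(question, passages))
    return f"{ROLE_PROMPT}\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer:"


# Toy data, invented for this example only.
samples = [
    "What causes glaucoma?",
    "How do I treat a sprained ankle?",
    "Is cataract surgery safe?",
]
passages = [
    "Glaucoma is associated with raised intraocular pressure.",
    "A cataract is a clouding of the lens.",
]

kept = [s for s in samples if is_eye_related(s)]
print(kept)
print(build_prompt("What causes glaucoma?", passages))
```

The filter keeps only the two eye-related questions, and the prompt builder places the best-matching passage ahead of the question, mirroring at toy scale the idea of grounding answers in textbook content.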
Pages: 15