Performance of ChatGPT and GPT-4 on Polish National Specialty Exam (NSE) in Ophthalmology

Citations: 0
Authors
Ciekalski, Marcin [1]
Laskowski, Maciej [1]
Koperczak, Agnieszka [1]
Smierciak, Maria [1]
Sirek, Sebastian [2]
Affiliations
[1] Med Univ Silesia, Fac Med Sci Katowice, Student Sci Soc, Dept Ophthalmol, Katowice, Poland
[2] Med Univ Silesia, Fac Med Sci Katowice, Dept Ophthalmol, Katowice, Poland
Source
Keywords
ophthalmology; ChatGPT; Polish national specialty exam
DOI
10.2478/ahem-2024-0006
Chinese Library Classification (CLC) number
R-3 [Medical research methods]; R3 [Basic medicine]
Discipline classification code
1001
Abstract
Introduction: Artificial intelligence (AI) has evolved significantly, driven by advancements in computing power and big data. Technologies like machine learning and deep learning have led to sophisticated models such as GPT-3.5 and GPT-4. This study assesses the performance of these AI models on the Polish National Specialty Exam in ophthalmology, exploring their potential to support research, education, and clinical decision-making in healthcare.
Materials and Methods: The study analyzed 98 questions from the Spring 2023 Polish NSE in Ophthalmology. Questions were categorized into five groups: Physiology & Diagnostics, Clinical & Case Questions, Treatment & Pharmacology, Surgery, and Pediatrics. GPT-3.5 and GPT-4 were tested for their accuracy in answering these questions, with a confidence rating from 1 to 5 assigned to each response. Statistical analyses, including the Chi-squared test and Mann-Whitney U test, were employed to compare the models' performance.
Results: GPT-4 demonstrated a significant improvement over GPT-3.5, correctly answering 63.3% of questions compared to GPT-3.5's 37.8%. GPT-4's performance met the passing criteria for the NSE. The models showed varying degrees of accuracy across different categories, with a notable gap in fields like surgery and pediatrics.
Conclusions: The study highlights the potential of GPT models in aiding clinical decisions and educational purposes in ophthalmology. However, it also underscores the models' limitations, particularly in specialized fields like surgery and pediatrics. The findings suggest that while AI models like GPT-3.5 and GPT-4 can significantly assist in the medical field, they require further development and fine-tuning to address specific challenges in various medical domains.
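The accuracy comparison reported in the abstract can be reproduced approximately as a 2x2 Chi-squared test. The sketch below is illustrative only and rests on assumptions not stated in the record: the correct-answer counts (62 and 37 of 98) are back-calculated from the reported percentages, and the test is taken to be an uncorrected Pearson Chi-squared test with one degree of freedom. The function name `chi2_2x2` is hypothetical, and the authors' actual procedure may differ (for example, by applying a continuity correction).

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


TOTAL = 98  # number of exam questions reported in the abstract

# Assumption: counts inferred by rounding the reported percentages.
gpt4_correct = round(0.633 * TOTAL)
gpt35_correct = round(0.378 * TOTAL)

stat, p = chi2_2x2(
    gpt4_correct, TOTAL - gpt4_correct,
    gpt35_correct, TOTAL - gpt35_correct,
)
print(f"GPT-4 correct: {gpt4_correct}/{TOTAL}, GPT-3.5 correct: {gpt35_correct}/{TOTAL}")
print(f"chi-squared = {stat:.2f}, p = {p:.5f}")
```

Under these assumptions the p-value falls well below 0.05, which is consistent with the abstract's description of the improvement as significant. Because both models answered the same 98 questions, a paired test such as McNemar's would be an alternative design, but the abstract names only the Chi-squared and Mann-Whitney U tests.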
Pages: 111-116
Page count: 6
Related papers
50 in total
  • [41] Is GPT-4 capable of passing MIR 2023? Comparison between GPT-4 and ChatGPT-3 in the MIR 2022 and 2023 exams
    Cerame, Alvaro
    Juaneda, Juan
    Estrella-Porter, Pablo
    de la Puente, Lucia
    Navarro, Joaquin
    Garcia, Eva
    Sanchez, Domingo A.
    Carrasco, Juan Pablo
    SPANISH JOURNAL OF MEDICAL EDUCATION, 2024, 5 (02):
  • [42] The Performance of GPT-3.5, GPT-4, and Bard on the Japanese National Dentist Examination: A Comparison Study
    Ohta, Keiichi
    Ohta, Satomi
    CUREUS JOURNAL OF MEDICAL SCIENCE, 2023, 15 (12)
  • [43] Reflections on the new technological paradigm of ChatGPT and GPT-4
    翟尤
    互联网天地, 2023, (04) : 28 - 33
  • [44] The potential impact of ChatGPT/GPT-4 on surgery: will it topple the profession of surgeons?
    Cheng, Kunming
    Sun, Zaijie
    He, Yongbin
    Gu, Shuqin
    Wu, Haiyang
    INTERNATIONAL JOURNAL OF SURGERY, 2023, 109 (05) : 1545 - 1547
  • [45] Assessing the medical reasoning skills of GPT-4 in complex ophthalmology cases
    Milad, Daniel
    Antaki, Fares
    Milad, Jason
    Farah, Andrew
    Khairy, Thomas
    Mikhail, David
    Giguere, Charles-Edouard
    Touma, Samir
    Bernstein, Allison
    Szigiato, Andrei-Alexandru
    Nayman, Taylor
    Mullie, Guillaume A.
    Duval, Renaud
    BRITISH JOURNAL OF OPHTHALMOLOGY, 2024, 108 (10) : 1398 - 1405
  • [46] Performance of GPT-4 on Chinese Nursing Examination
    Miao, Yiqun
    Luo, Yuan
    Zhao, Yuhan
    Li, Jiawei
    Liu, Mingxuan
    Wang, Huiying
    Chen, Yuling
    Wu, Ying
    NURSE EDUCATOR, 2024, 49 (06) : E338 - E343
  • [47] INTERVENTIONAL NEPHROLOGY ASSESSMENT QUESTIONS: A PERFORMANCE EVALUATION AND COMPARATIVE ANALYSIS OF CHATGPT-3.5 AND GPT-4
    Sheikh, Mohammad
    Qureshi, Fawad
    Thongprayoon, Charat
    Suarez, Lourdes Gonzalez
    Craici, Iasmina
    Cheungpasitporn, Visit
    AMERICAN JOURNAL OF KIDNEY DISEASES, 2024, 83 (04) : S100 - S101
  • [48] Große Sprachmodelle wie ChatGPT und GPT-4 für eine patientenzentrierte Radiologie [Large language models such as ChatGPT and GPT-4 for patient-centered care in radiology]
    Fink, Matthias A.
    Die Radiologie, 2023, 63 : 665 - 671
  • [49] GPT-4/4V's performance on the Japanese National Medical Licensing Examination
    Kawahara, Tomoki
    Sumi, Yuki
    MEDICAL TEACHER, 2025, 47 (03) : 450 - 457
  • [50] Artificial Intelligence in Intensive Care Medicine: Toward a ChatGPT/GPT-4 Way
    Lu, Yanqiu
    Wu, Haiyang
    Qi, Shaoyan
    Cheng, Kunming
    ANNALS OF BIOMEDICAL ENGINEERING, 2023, 51 (09) : 1898 - 1903