How does ChatGPT perform on the European Board of Pediatric Surgery examination? A randomized comparative study

Cited by: 2
Authors
Azizoglu, Mustafa [1]
Aydogdu, Bahattin [2]
Affiliations
[1] Dicle Univ, Med Sch, Dept Pediat Surg, Diyarbakir, Turkiye
[2] Balikesir Univ, Dept Pediat Surg, Balikesir, Turkiye
Source
MEDICINA BALEAR | 2024 / Vol. 39 / Issue 01
Keywords
ChatGPT; Pediatric Surgery; exam; questions; artificial intelligence
DOI
10.3306/AJHS.2024.39.01.23
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Purpose: This study compared the accuracy and responsiveness of GPT-3.5 and GPT-4 in pediatric surgery, specifically their ability to correctly answer sample questions from the European Board of Pediatric Surgery (EBPS) examination.
Methods: The study was conducted between 20 May 2023 and 30 May 2023. It was a comparative analysis of two AI language models, GPT-3.5 and GPT-4, on EBPS exam sample questions. Two sets of 105 sample questions each (210 in total), derived from the EBPS sample questions, were collated.
Results: In General Pediatric Surgery, GPT-3.5 answered 7 questions correctly (46.7%), whereas GPT-4 answered 13 correctly (86.7%) (p=0.020). For Newborn Surgery and Pediatric Urology, GPT-3.5 answered 6 questions correctly (40.0%), whereas GPT-4 answered 12 correctly (80.0%) (p=0.025). In total, GPT-3.5 answered 46 of 105 questions correctly (43.8%), and GPT-4 performed significantly better, answering 80 correctly (76.2%) (p<0.001). Across all responses, the odds ratio for GPT-4 versus GPT-3.5 was 4.1; that is, the odds of a correct answer to the pediatric surgery questions were 4.1 times higher for GPT-4 than for GPT-3.5.
Conclusion: GPT-4 significantly outperforms GPT-3.5 in answering EBPS exam questions.
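The reported odds ratio can be checked from the overall counts in the abstract (80 of 105 correct for GPT-4, 46 of 105 for GPT-3.5). The following is a minimal sketch, not the authors' analysis code; the function names `odds_ratio` and `accuracy_ratio` are hypothetical and introduced only for illustration.

```python
# Minimal sketch: reproduce the odds ratio from the counts given in the abstract.
# Function names are illustrative assumptions, not taken from the paper.

def odds_ratio(a_correct: int, a_wrong: int, b_correct: int, b_wrong: int) -> float:
    """Odds of a correct answer for model A divided by the odds for model B."""
    return (a_correct / a_wrong) / (b_correct / b_wrong)


def accuracy_ratio(a_correct: int, b_correct: int, total: int) -> float:
    """Ratio of the two accuracies (proportions correct), for comparison."""
    return (a_correct / total) / (b_correct / total)


TOTAL = 105                      # questions answered by each model
GPT4_CORRECT = 80                # 76.2% in the abstract
GPT35_CORRECT = 46               # 43.8% in the abstract

or_value = odds_ratio(
    GPT4_CORRECT, TOTAL - GPT4_CORRECT,
    GPT35_CORRECT, TOTAL - GPT35_CORRECT,
)
acc_ratio = accuracy_ratio(GPT4_CORRECT, GPT35_CORRECT, TOTAL)

print(f"Odds ratio (GPT-4 vs GPT-3.5): {or_value:.2f}")
print(f"Accuracy ratio (GPT-4 vs GPT-3.5): {acc_ratio:.2f}")
```

The odds ratio comes out at about 4.1, matching the abstract, while the ratio of the two accuracies is about 1.7. The two quantities differ because an odds ratio compares odds (correct versus incorrect), not the proportions of correct answers.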
Pages: 23-26
Page count: 4