How does ChatGPT perform on the European Board of Pediatric Surgery examination? A randomized comparative study

Cited by: 2
Authors
Azizoglu, Mustafa [1]
Aydogdu, Bahattin [2]
Affiliations
[1] Dicle Univ, Med Sch, Dept Pediat Surg, Diyarbakir, Turkiye
[2] Balikesir Univ, Dept Pediat Surg, Balikesir, Turkiye
Source
MEDICINA BALEAR | 2024 / Vol. 39 / Issue 01
Keywords
ChatGPT; Pediatric Surgery; exam; questions; artificial intelligence
DOI
10.3306/AJHS.2024.39.01.23
Chinese Library Classification number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Purpose: This study compared the accuracy and responsiveness of GPT-3.5 and GPT-4 in pediatric surgery, specifically their ability to correctly answer sample questions from the European Board of Pediatric Surgery (EBPS) exam. Methods: The study was conducted between 20 May 2023 and 30 May 2023. Two sets of 105 sample questions each (210 in total), derived from the EBPS sample questions, were collated and used to compare the two AI language models. Results: In General Pediatric Surgery, GPT-3.5 answered 7 questions correctly (46.7%), while GPT-4 answered 13 correctly (86.7%) (p=0.020). In Newborn Surgery and Pediatric Urology, GPT-3.5 answered 6 questions correctly (40.0%), while GPT-4 answered 12 correctly (80.0%) (p=0.025). Overall, GPT-3.5 answered 46 of 105 questions correctly (43.8%), and GPT-4 performed significantly better, answering 80 of 105 correctly (76.2%) (p<0.001). Across all responses, the odds ratio for GPT-4 versus GPT-3.5 was 4.1; that is, the odds of a correct answer to the pediatric surgery questions were about 4.1 times higher with GPT-4 than with GPT-3.5. Conclusion: GPT-4 significantly outperforms GPT-3.5 in answering EBPS exam questions.
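The overall figures reported in the abstract can be checked with a few lines of arithmetic. The sketch below is not the authors' analysis code; it assumes each model answered the same 105 questions (as the reported percentages imply), and the function and variable names are illustrative only. It also computes the ratio of the two accuracies, which is a different quantity from the odds ratio.

```python
def odds_ratio(correct_a: int, correct_b: int, total: int) -> float:
    """Odds of a correct answer for model A divided by the odds for model B."""
    odds_a = correct_a / (total - correct_a)
    odds_b = correct_b / (total - correct_b)
    return odds_a / odds_b


# Totals taken from the abstract; the shared denominator of 105 is an assumption.
n_questions = 105
gpt35_correct = 46
gpt4_correct = 80

gpt35_accuracy = 100 * gpt35_correct / n_questions
gpt4_accuracy = 100 * gpt4_correct / n_questions
or_total = odds_ratio(gpt4_correct, gpt35_correct, n_questions)

# Ratio of the two proportions (relative "risk" of a correct answer).
accuracy_ratio = gpt4_correct / gpt35_correct

print(f"GPT-3.5 accuracy: {gpt35_accuracy:.1f}%")  # 43.8%
print(f"GPT-4 accuracy:   {gpt4_accuracy:.1f}%")   # 76.2%
print(f"Odds ratio:       {or_total:.1f}")         # 4.1
print(f"Accuracy ratio:   {accuracy_ratio:.2f}")   # 1.74
```

The odds ratio of about 4.1 matches the reported value, while the ratio of the accuracies themselves is about 1.74, which is why the result is best read as a ratio of odds.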
Pages: 23-26
Number of pages: 4