ChatGPT vs UpToDate: comparative study of usefulness and reliability of Chatbot in common clinical presentations of otorhinolaryngology-head and neck surgery

Cited by: 10
Authors
Karimov, Ziya [1]
Allahverdiyev, Irshad [2]
Agayarov, Ozlem Yagiz [3]
Demir, Dogukan [3]
Almuradova, Elvina [4, 5]
Affiliations
[1] Ege Univ, Med Program, Fac Med, TR-35100 Izmir, Turkiye
[2] Istanbul Univ, Istanbul Fac Med, Program Med, Istanbul, Turkiye
[3] Hlth Sci Univ, Izmir Tepecik Educ & Res Hosp, Dept Otolaryngol Head & Neck Surg, Izmir, Turkiye
[4] Ege Univ, Fac Med, Dept Med Oncol, Izmir, Turkiye
[5] Medicana Int Hosp, Dept Oncol, Izmir, Turkiye
Keywords
Artificial intelligence; Chatbot; ChatGPT; ENT; UpToDate; Otorhinolaryngology and head and neck surgery; EPIDEMIOLOGY; AGREEMENT
DOI
10.1007/s00405-023-08423-w
Chinese Library Classification number
R76 [Otorhinolaryngology]
Discipline classification code
100213
Abstract
Purpose: The use of chatbots, a form of artificial intelligence, in medicine has been increasing in recent years. UpToDate® is another well-known search tool, built on evidence-based knowledge and used daily by doctors worldwide. In this study, we aimed to investigate the usefulness and reliability of ChatGPT compared to UpToDate in Otorhinolaryngology and Head and Neck Surgery (ORL-HNS).
Materials and methods: ChatGPT-3.5 and UpToDate were interrogated for the management of 25 common clinical case scenarios (13 males/12 females) recruited from the literature, considering the daily observation at the Department of Otorhinolaryngology of Ege University Faculty of Medicine. Scientific references for the management were requested for each clinical case. The accuracy of the references in the ChatGPT answers was assessed on a 0-2 scale, and the usefulness of the ChatGPT and UpToDate answers was assessed with scores of 1-3 by reviewers. UpToDate and ChatGPT-3.5 responses were compared.
Results: In contrast to UpToDate, ChatGPT did not give references for some questions. ChatGPT's information was limited to 2021. UpToDate supported its content with subheadings, tables, figures, and algorithms. The mean accuracy score of references in ChatGPT answers was 0.25 (weak/unrelated). The median (Q1-Q3) was 1.00 (1.25-2.00) for ChatGPT and 2.63 (2.75-3.00) for UpToDate; the difference was statistically significant (p < 0.001). UpToDate was observed to be more useful and reliable than ChatGPT.
Conclusions: ChatGPT has the potential to support physicians in finding information, but our results suggest that ChatGPT needs to be improved to increase the usefulness and reliability of medical evidence-based knowledge.
Pages: 2145-2151
Page count: 7
Source: European Archives of Oto-Rhino-Laryngology, 2024, 281: 2145-2151