Chat Generative Pretrained Transformer (ChatGPT) and Bard: Artificial Intelligence Does not yet Provide Clinically Supported Answers for Hip and Knee Osteoarthritis

Cited by: 9
Authors
Yang, Jaewon [1, 5]
Ardavanis, Kyle S. [2]
Slack, Katherine E. [3]
Fernando, Navin D. [1]
Della Valle, Craig J. [4]
Hernandez, Nicholas M. [1]
Affiliations
[1] Univ Washington, Dept Orthopaed Surg, Seattle, WA USA
[2] Madigan Army Med Ctr, Dept Orthopaed Surg, Tacoma, WA 98431 USA
[3] Washington State Univ, Elson S Floyd Coll Med, Spokane, WA USA
[4] Rush Univ, Dept Orthopaed Surg, Med Ctr, Chicago, IL USA
[5] Univ Washington, Dept Orthopaed & Sports Med, Seattle, WA 98104 USA
Source
JOURNAL OF ARTHROPLASTY | 2024 / Vol. 39 / Issue 05
Keywords
ChatGPT; Bard; machine learning; artificial intelligence; large language models; LEARNING ALGORITHM; COMPLICATIONS; ARTHROPLASTY
DOI
10.1016/j.arth.2024.01.029
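Chinese Library Classification (CLC) number
R826.8 [Plastic surgery]; R782.2 [Oral and maxillofacial plastic surgery]; R726.2 [Pediatric plastic surgery]; R62 [Plastic surgery (reconstructive surgery)];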
Abstract
Background: Advancements in artificial intelligence (AI) have led to the creation of large language models (LLMs), such as Chat Generative Pretrained Transformer (ChatGPT) and Bard, that analyze online resources to synthesize responses to user queries. Despite the popularity of LLMs, the accuracy of their responses to medical questions remains unknown. This study aimed to compare the responses of ChatGPT and Bard regarding treatments for hip and knee osteoarthritis with the American Academy of Orthopaedic Surgeons (AAOS) Evidence-Based Clinical Practice Guidelines (CPGs) recommendations. Methods: Both ChatGPT (OpenAI) and Bard (Google) were queried regarding 20 treatments (10 for hip and 10 for knee osteoarthritis) from the AAOS CPGs. Responses were classified by 2 reviewers as being in "Concordance," "Discordance," or "No Concordance" with AAOS CPGs. A Cohen's Kappa coefficient was used to assess inter-rater reliability, and Chi-squared analyses were used to compare responses between LLMs. Results: Overall, ChatGPT and Bard provided responses that were concordant with the AAOS CPGs for 16 (80%) and 12 (60%) treatments, respectively. Notably, ChatGPT and Bard encouraged the use of nonrecommended treatments in 30% and 60% of queries, respectively. There were no differences in performance when evaluating by joint or by recommended versus nonrecommended treatments. Studies were referenced in 6 (30%) of the Bard responses and none (0%) of the ChatGPT responses. Of the 6 Bard responses, the cited studies could be identified for only 1 (16.7%). Of the remaining 5, 2 (33.3%) responses cited studies in journals that did not exist, 2 (33.3%) cited studies that could not be found with the information given, and 1 (16.7%) provided links to unrelated studies. Conclusions: Neither ChatGPT nor Bard consistently provides responses that align with the AAOS CPGs. Consequently, physicians and patients should temper expectations on the guidance AI platforms can currently provide.
(c) 2024 Elsevier Inc. All rights reserved.
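The two statistics named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The 2x2 table uses the overall concordance counts reported in the Results (16 of 20 for ChatGPT, 12 of 20 for Bard), and the test is computed without a continuity correction, which is an assumption because the abstract does not say whether one was applied. The per-reviewer labels passed to `cohen_kappa` are hypothetical placeholders, since the abstract does not report individual ratings; the function names are likewise illustrative.

```python
from collections import Counter
from math import erfc, sqrt


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined,
        # and returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def chi_squared_2x2(table):
    """Pearson chi-squared for a 2x2 table, no continuity correction.

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            stat += (obs - exp) ** 2 / exp
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value


# Counts from the abstract: rows are ChatGPT and Bard,
# columns are concordant and not concordant with the AAOS CPGs.
concordance_table = [[16, 4], [12, 8]]
stat, p = chi_squared_2x2(concordance_table)
print(f"chi-squared = {stat:.3f}, p = {p:.3f}")

# HYPOTHETICAL reviewer labels, for illustration only (not study data).
reviewer_1 = ["Concordance", "Concordance", "Discordance", "No Concordance"]
reviewer_2 = ["Concordance", "Discordance", "Discordance", "No Concordance"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.3f}")
```

With only 20 queries per model, the expected counts in some cells are small, so an exact test or a continuity correction might be preferred in practice; the sketch keeps to the plain Pearson statistic for clarity.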
Pages: 1184-1190
Page count: 7