Assessing ChatGPT4 with and without retrieval-augmented generation in anticoagulation management for gastrointestinal procedures

Cited by: 1
Authors
Malik, Sheza [1]
Kharel, Himal [1]
Dahiya, Dushyant S. [2]
Ali, Hassam [3]
Blaney, Hanna [4]
Singh, Achintya [5]
Dhar, Jahnvi [6]
Perisetti, Abhilash [7]
Facciorusso, Antonio [7]
Chandan, Saurabh [8]
Mohan, Babu P. [9]
Affiliations
[1] Rochester Gen Hosp, Portland Ave, Rochester, NY 14621 USA
[2] Univ Kansas, Sch Med, Kansas City, KS USA
[3] East Carolina Univ, Greenville, NC USA
[4] NYU, Grossman Sch Med, New York, NY USA
[5] Metro Hlth, Cleveland, OH USA
[6] Postgrad Inst Med Educ & Res, Chandigarh, India
[7] Univ Foggia, Foggia, Italy
[8] Creighton Univ, Med Ctr, Omaha, NE USA
[9] Orlando Gastroenterol, Orlando, FL USA
Source
ANNALS OF GASTROENTEROLOGY | 2024 / Vol. 37 / Issue 05
Keywords
Anticoagulation management; gastrointestinal procedures; accuracy; ChatGPT4-RAG; endoscopic procedures
DOI
10.20524/aog.2024.0907
Chinese Library Classification number
R57 [Diseases of the digestive system and abdomen]
Discipline classification code
Abstract
Background: In view of the growing complexity of managing anticoagulation for patients undergoing gastrointestinal (GI) procedures, this study evaluated ChatGPT-4's ability to provide accurate medical guidance, comparing it with an earlier artificial intelligence (AI) model (ChatGPT-3.5) and with a retrieval-augmented generation (RAG)-supported model (ChatGPT4-RAG).
Methods: Thirty-six anticoagulation-related questions, based on professional guidelines, were answered by ChatGPT-4. Nine gastroenterologists assessed these responses for accuracy and relevance. ChatGPT-4's performance was also compared to that of ChatGPT-3.5 and ChatGPT4-RAG. Additionally, a survey was conducted to understand gastroenterologists' perceptions of ChatGPT-4.
Results: ChatGPT-4's responses were significantly more accurate and coherent than those of ChatGPT-3.5, with 30.5% of responses fully accurate and 47.2% generally accurate. ChatGPT4-RAG demonstrated a higher ability to integrate current information, achieving 75% full accuracy. Notably, for diagnostic and therapeutic esophagogastroduodenoscopy, 51.8% of responses were fully accurate; for endoscopic retrograde cholangiopancreatography with and without stent placement, 42.8% were fully accurate; and for diagnostic and therapeutic colonoscopy, 50% were fully accurate.
Conclusions: ChatGPT4-RAG significantly advances anticoagulation management in endoscopic procedures, offering reliable and precise medical guidance. However, medicolegal considerations mean that a 75% full accuracy rate remains inadequate for independent clinical decision-making. AI may be more appropriately used to support and confirm clinicians' decisions rather than to replace them. Further evaluation is essential to maintain patient confidentiality and the integrity of the physician-patient relationship.
Pages: 514-526
Page count: 13