Diagnostic Accuracy of a Custom Large Language Model on Rare Pediatric Disease Case Reports

Cited by: 2
Authors
Young, Cameron C. [1 ,2 ]
Enichen, Ellie [1 ,2 ]
Rivera, Christian [1 ,2 ]
Auger, Corinne A. [1 ,2 ]
Grant, Nathan [1 ,2 ]
Rao, Arya [1 ,2 ]
Succi, Marc D. [2 ,3 ]
Affiliations
[1] Harvard Med Sch, Boston, MA USA
[2] Mass Gen Brigham, Innovat Operat Res Ctr, Medically Engn Solut Healthcare Incubator, Boston, MA 02199 USA
[3] Massachusetts Gen Hosp, Dept Radiol, Boston, MA 02114 USA
Keywords
artificial intelligence; diagnostic support; genetics; large language models; pediatric rare disease
DOI
10.1002/ajmg.a.63878
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Accurately diagnosing rare pediatric diseases frequently represents a clinical challenge due to their complex and unusual clinical presentations. Here, we explore the capabilities of three large language models (LLMs), GPT-4, Gemini Pro, and a custom-built LLM (GPT-4 integrated with the Human Phenotype Ontology [GPT-4 HPO]), by evaluating their diagnostic performance on 61 rare pediatric disease case reports. The performance of the LLMs was assessed for accuracy in identifying the specific diagnosis, listing the correct diagnosis in a differential list, and identifying the broad disease category. In addition, GPT-4 HPO was tested on 100 general pediatrics case reports previously assessed on other LLMs to further validate its performance. The results indicated that GPT-4 predicted the correct diagnosis with a diagnostic accuracy of 13.1%, whereas both GPT-4 HPO and Gemini Pro had diagnostic accuracies of 8.2%. Further, GPT-4 HPO showed improved performance compared with the other two LLMs in identifying the correct diagnosis in its differential list and the broad disease category. Although these findings underscore the potential of LLMs for diagnostic support, particularly when enhanced with domain-specific ontologies, they also stress the need for further improvement prior to integration into clinical practice.
Pages: 6