Performance evaluation of large language models with chain-of-thought reasoning ability in clinical laboratory case interpretation

Cited by: 0
Authors
Yang, He S. [1]
Li, Jieli [2]
Yi, Xin [1,3]
Wang, Fei [4]
Affiliations
[1] Weill Cornell Med, Dept Pathol & Lab Med, 525 E 68th St,F707, New York, NY 10065 USA
[2] Ohio State Univ, Wexner Med Ctr, Dept Pathol, Columbus, OH USA
[3] Houston Methodist Hosp, Dept Pathol & Genom Med, Houston, TX USA
[4] Weill Cornell Med, Dept Populat Hlth Sci, New York, NY USA
Keywords
large language models; chain-of-thought; retrieval augmented generation; AI Chatbot; laboratory medicine;
DOI
10.1515/cclm-2025-0055
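Chinese Library Classification number
R446 [Laboratory diagnosis]; R-33 [Experimental medicine, medical experiments];
Discipline classification code
1001;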
Pages: 3
Related papers
50 in total
  • [41] A comparison of the diagnostic ability of large language models in challenging clinical cases
    Khan, Maria Palwasha
    O'Sullivan, Eoin Daniel
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2024, 7
  • [42] Study Tests Large Language Models' Ability to Answer Clinical Questions
    Harris, Emily
    JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 2023, 330 (06): 496 - 496
  • [43] Evaluating Large Language Models with NeuBAROCO: Syllogistic Reasoning Ability and Human-like Biases
    Ando, Risako
    Morishita, Takanobu
    Abe, Hirohiko
    Mineshima, Koji
    Okada, Mitsuhiro
    arXiv, 2023
  • [44] The COT COLLECTION: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
    Kim, Seungone
    Joo, Se June
    Kim, Doyoung
    Jang, Joel
    Ye, Seonghyeon
    Shin, Jamin
    Seo, Minjoon
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023: 12685 - 12708
  • [45] Evaluation and Analysis of the Chinese Semantic Dependency Understanding Ability of Large Language Models
    Shen, Zizhuo
    Li, Wei
    Shao, Yanqiu
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT III, NLPCC 2024, 2025, 15361 : 92 - 104
  • [46] Beyond Topic Modeling: Comparative Evaluation of Topic Interpretation by Large Language Models
    de Melo, Tiago
    Merialdo, Paolo
    INTELLIGENT SYSTEMS AND APPLICATIONS, VOL 4, INTELLISYS 2024, 2024, 1068 : 215 - 230
  • [47] Expert evaluation of large language models for clinical dialogue summarization
    Navarro, David Fraile
    Coiera, Enrico
    Hambly, Thomas W.
    Triplett, Zoe
    Asif, Nahyan
    Susanto, Anindya
    Chowdhury, Anamika
    Lorenzo, Amaya Azcoaga
    Dras, Mark
    Berkovsky, Shlomo
    SCIENTIFIC REPORTS, 2025, 15 (01)
  • [48] Multimodal large language models address clinical queries in laryngeal cancer surgery: a comparative evaluation of image interpretation across different models
    Liang, Bingyu
    Gao, Yifan
    Wang, Taibao
    Zhang, Lei
    Wang, Qin
    INTERNATIONAL JOURNAL OF SURGERY, 2025, 111 (03) : 2727 - 2730
  • [49] Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
    Levy, Mosh
    Jacoby, Alon
    Goldberg, Yoav
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024: 15339 - 15353
  • [50] Evaluation and Analysis of Large Language Models for Clinical Text Augmentation and Generation
    Latif, Atif
    Kim, Jihie
    IEEE ACCESS, 2024, 12 : 48987 - 48996