Comprehensive testing of large language models for extraction of structured data in pathology

Authors
Bastian Grothey [1]
Jan Odenkirchen [2]
Adnan Brkic [1]
Birgid Schömig-Markiefka [1]
Alexander Quaas [1]
Reinhard Büttner [1]
Yuri Tolkach [1]
Affiliations
[1] University Hospital Cologne, Institute of Pathology
[2] University of Cologne, Medical Faculty
DOI
10.1038/s43856-025-00808-8
Abstract
Pathology departments produce many diagnostic reports as free text, which is hard to analyze or use in research and computer projects. Converting this free text into standardized, structured information, such as test results or diagnoses, makes it easier to use. This task often requires human experts and takes time. Large language models (LLMs), which are advanced computer systems designed to understand and generate human-like text, might simplify this process. Here, we tested six LLMs, including freely available models and the commercial GPT-4 model, using 579 pathology reports in English and German. Our results show that freely available models can perform as well as commercial ones, providing a cheaper solution while avoiding privacy concerns. The shared dataset will support future research in pathology data processing.