Comprehensive testing of large language models for extraction of structured data in pathology

Authors
Bastian Grothey [1]
Jan Odenkirchen [2]
Adnan Brkic [1]
Birgid Schömig-Markiefka [1]
Alexander Quaas [1]
Reinhard Büttner [1]
Yuri Tolkach [1]
Affiliations
[1] University Hospital Cologne, Institute of Pathology
[2] University of Cologne, Medical Faculty
DOI
10.1038/s43856-025-00808-8
Abstract
Pathology departments produce many diagnostic reports as free text, which is hard to analyze or use in research and computer projects. Converting this free text into more standardized, organized information, such as test results or diagnoses, makes it easier to use. This task often requires human experts and takes time. Large language models (LLMs), which are advanced computer systems designed to understand and generate human-like text, might simplify this process. Here, we tested six LLMs, including freely available models and the commercial GPT-4 model, using 579 pathology reports in English and German. Our results show that freely available models can perform as well as commercial ones, providing a cheaper solution while avoiding privacy concerns. The shared dataset will support future research in pathology data processing.