Probing the Consistency of Situational Information Extraction with Large Language Models: A Case Study on Crisis Computing

Cited by: 0
Authors
Salfinger, Andrea [1 ]
Snidaro, Lauro [1 ]
Affiliations
[1] Univ Udine, Dept Math Comp Sci & Phys, Udine, Italy
Funding
Austrian Science Fund (FWF);
Keywords
Large Language Models; Crisis Management; Situation Awareness; Soft Fusion; High-Level Information Fusion;
DOI
10.1109/CogSIMA61085.2024.10553903
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently introduced foundation models for language modeling, also known as Large Language Models (LLMs), have demonstrated breakthrough capabilities in text summarization and contextualized natural language processing. However, they also suffer from inherent deficiencies, such as the occasional generation of factually incorrect information, known as hallucinations, and weak consistency of the produced answers, which vary strongly with the exact phrasing of the input query, i.e., the prompt. This raises the question of whether and how LLMs could replace or complement traditional information extraction and fusion modules in information fusion pipelines involving textual input sources. We empirically examine this question in a case study from crisis computing, based on the established CrisisFacts benchmark dataset, by probing an LLM's situation understanding and summarization capabilities on the target task of extracting information relevant for establishing crisis situation awareness from social media corpora. Since social media messages are exchanged in real time and typically target human readers who are already aware of the situational context, this domain represents a prime testbed for evaluating LLMs' situational information extraction capabilities. In this work, we specifically investigate the consistency of the extracted information across different model configurations and across different but semantically similar prompts, a crucial prerequisite for a reliable and trustworthy information extraction component.
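
The consistency probing described above can be pictured, in simplified form, as issuing the same extraction request under several paraphrased prompts and measuring how much the returned sets of facts overlap. The following sketch is a hypothetical illustration of that idea and is not taken from the paper; the `query_llm` stand-in, the prompt wordings, and the Jaccard-based overlap score are assumptions made for this example.

```python
# Illustrative sketch only, not the authors' implementation: `query_llm`, the
# prompt variants, and the Jaccard overlap score are placeholder assumptions.
from itertools import combinations


def query_llm(prompt: str) -> set[str]:
    """Hypothetical LLM call: returns the set of extracted facts,
    lightly normalized (e.g., lower-cased, stripped) for comparison."""
    raise NotImplementedError("plug in your preferred LLM client here")


# Semantically similar phrasings of the same extraction request.
PROMPT_VARIANTS = [
    "List the facts in the following posts that are relevant for crisis responders:\n{posts}",
    "Which pieces of information in the posts below matter for crisis situation awareness?\n{posts}",
    "Extract actionable, crisis-related information from the following messages:\n{posts}",
]


def jaccard(a: set[str], b: set[str]) -> float:
    """Set overlap of two extraction results: 1.0 = identical, 0.0 = disjoint."""
    return len(a & b) / len(a | b) if (a | b) else 1.0


def prompt_consistency(posts: list[str]) -> float:
    """Mean pairwise overlap of the facts extracted under paraphrased prompts."""
    blob = "\n".join(posts)
    extractions = [query_llm(p.format(posts=blob)) for p in PROMPT_VARIANTS]
    pairs = list(combinations(extractions, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

The same pairwise-overlap scheme can be reused for the other axis of comparison mentioned in the abstract, consistency across model configurations, by holding one prompt fixed and varying, for example, the model variant or the decoding temperature.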
Pages: 91 - 98
Number of pages: 8
Related papers
50 items in total
  • [41] Scalable information extraction from free text electronic health records using large language models
    Gu, Bowen
    Shao, Vivian
    Liao, Ziqian
    Carducci, Valentina
    Brufau, Santiago Romero
    Yang, Jie
    Desai, Rishi J.
    BMC MEDICAL RESEARCH METHODOLOGY, 2025, 25 (01)
  • [42] LLM-IE: a Python package for biomedical generative information extraction with large language models
    Hsu, Enshuo
    Roberts, Kirk
    JAMIA OPEN, 2025, 8 (02)
  • [43] Enhancing Visual Information Extraction with Large Language Models Through Layout-Aware Instruction Tuning
    Li, Teng
    Wang, Jiapeng
    Jin, Lianwen
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VII, 2025, 15037 : 276 - 289
  • [44] From Form(s) to Meaning: Probing the Semantic Depths of Language Models Using Multisense Consistency
    Ohmer, Xenia
    Bruni, Elia
    Hupkes, Dieuwke
    COMPUTATIONAL LINGUISTICS, 2024, 50 (04) : 1507 - 1556
  • [45] Using Large Language Models for Math Information Retrieval
    Mansouri, Behrooz
    Maarefdoust, Reihaneh
    PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024, : 2693 - 2697
  • [46] Assessing the Neuropsychology Information Base of Large Language Models
    Kronenberger, Oscar
    Bullinger, Leah
    Kaser, Alyssa N.
    Cullum, Munro C.
    Schaffert, Jeffrey
    Harder, Lana
    Lacritz, Laura
    ARCHIVES OF CLINICAL NEUROPSYCHOLOGY, 2024, 39 (07) : 1214 - 1215
  • [47] Large Language Models for Tracking Reliability of Information Sources
    Zaroukian, Erin
    ARTIFICIAL INTELLIGENCE IN HCI, PT III, AI-HCI 2024, 2024, 14736 : 158 - 169
  • [48] TIME-UIE: Tourism-oriented figure information model and unified information extraction via large language models
    Fan, Zhanling
    Chen, Chongcheng
    Luo, Haifeng
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 278
  • [49] Adopting Pre-trained Large Language Models for Regional Language Tasks: A Case Study
    Gaikwad, Harsha
    Kiwelekar, Arvind
    Laddha, Manjushree
    Shahare, Shashank
    INTELLIGENT HUMAN COMPUTER INTERACTION, IHCI 2023, PT I, 2024, 14531 : 15 - 25
  • [50] Assessing dimensions of thought disorder with large language models: The tradeoff of accuracy and consistency
    Pugh, Samuel L.
    Chandler, Chelsea
    Cohen, Alex S.
    Diaz-Asper, Catherine
    Elvevag, Brita
    Foltz, Peter W.
    PSYCHIATRY RESEARCH, 2024, 341