Drowzee: Metamorphic Testing for Fact-Conflicting Hallucination Detection in Large Language Models

Cited by: 0
Authors
Li, Ningke [1 ]
Li, Yuekang [2 ]
Liu, Yi [3 ]
Shi, Ling [3 ]
Wang, Kailong [1 ]
Wang, Haoyu [1 ]
Affiliations
[1] Huazhong University of Science and Technology, Wuhan, China
[2] The University of New South Wales, Sydney, Australia
[3] Nanyang Technological University, Singapore, Singapore
Abstract
Large language models (LLMs) have revolutionized language processing, but face critical challenges with security, privacy, and generating hallucinations (coherent but factually inaccurate outputs). A major issue is fact-conflicting hallucination (FCH), where LLMs produce content that contradicts ground-truth facts. Addressing FCH is difficult due to two key challenges: 1) Automatically constructing and updating benchmark datasets is hard, as existing methods rely on manually curated static benchmarks that cannot cover the broad, evolving spectrum of FCH cases. 2) Validating the reasoning behind LLM outputs is inherently difficult, especially for complex logical relations. To tackle these challenges, we introduce a novel logic-programming-aided metamorphic testing technique for FCH detection. We develop an extensive and extensible framework that constructs a comprehensive factual knowledge base by crawling sources such as Wikipedia, seamlessly integrated into Drowzee. Using logical reasoning rules, we transform and augment this knowledge into a large set of test cases with ground-truth answers. We test LLMs on these cases through template-based prompts, requiring them to provide reasoned answers. To validate their reasoning, we propose two semantic-aware oracles that assess the similarity between the semantic structures of the LLM answers and the ground truth. Our approach automatically generates useful test cases and identifies hallucinations across six LLMs within nine domains, with hallucination rates ranging from 24.7% to 59.8%. Key findings include LLMs struggling with temporal concepts, out-of-distribution knowledge, and a lack of logical reasoning capabilities. The results show that logic-based test cases generated by Drowzee effectively trigger and detect hallucinations. To further mitigate the identified FCHs, we explored model editing techniques, which proved effective on a small scale (with edits to fewer than 1,000 knowledge pieces). Our findings emphasize the need for continued community efforts to detect and mitigate model hallucinations. © 2024 Copyright held by the owner/author(s).
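The abstract describes the pipeline only at a high level. As a rough illustration (not the authors' implementation), the Python sketch below shows the two core ideas: a hand-written logical reasoning rule that derives new ground-truth facts from knowledge-base triples and turns each derived fact into a templated test prompt, and a toy semantic-structure oracle that flags an answer whose recovered triple conflicts with the ground truth. The knowledge base, relation names, prompt template, and triple extractor here are all hypothetical stand-ins for Drowzee's actual components.

```python
# Illustrative sketch only: rule-based test-case generation plus a naive
# structure-matching oracle, in the spirit of the abstract's description.
from itertools import product

# Toy factual knowledge base of (subject, relation, object) triples (hypothetical).
KB = {
    ("Alan Turing", "born_in", "United Kingdom"),
    ("United Kingdom", "located_in", "Europe"),
    ("Marie Curie", "born_in", "Poland"),
    ("Poland", "located_in", "Europe"),
}

def derive_facts(kb):
    """Apply one composition rule:
    born_in(X, Y) and located_in(Y, Z)  =>  born_on_continent(X, Z)."""
    derived = set()
    for (s1, r1, o1), (s2, r2, o2) in product(kb, kb):
        if r1 == "born_in" and r2 == "located_in" and o1 == s2:
            derived.add((s1, "born_on_continent", o2))
    return derived

PROMPT_TEMPLATE = (
    "Was {subject} born on the continent of {object}? "
    "Answer yes or no and explain your reasoning."
)

def make_test_cases(kb):
    """Turn each derived fact into a (prompt, ground-truth triple) test case."""
    return [
        (PROMPT_TEMPLATE.format(subject=s, object=o), (s, r, o))
        for (s, r, o) in derive_facts(kb)
    ]

def extract_triple(answer, subject, obj):
    """Very naive stand-in for a semantic parser: recover a triple from the
    answer text. A real oracle would compare richer semantic structures."""
    affirmative = answer.strip().lower().startswith("yes")
    return (subject, "born_on_continent", obj) if affirmative else None

def oracle(answer, ground_truth):
    """Report a fact-conflicting hallucination when the answer's recovered
    structure does not match the ground-truth triple."""
    s, _, o = ground_truth
    return "consistent" if extract_triple(answer, s, o) == ground_truth else "hallucination"

if __name__ == "__main__":
    for prompt, truth in make_test_cases(KB):
        # In practice this answer would come from the LLM under test.
        fake_llm_answer = "Yes, because the birth country lies on that continent."
        print(prompt, "->", oracle(fake_llm_answer, truth))
```

In this toy setup the derived facts serve as ground truth, so any answer whose recovered triple disagrees with them is counted as a hallucination; the paper's actual oracles compare full semantic structures rather than a single yes/no triple.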
DOI
10.1145/3689776
Related Papers
50 results in total
  • [1] Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models
    Chen, Yuyan
    Fu, Qiang
    Yuan, Yichen
    Wen, Zhihao
    Fan, Ge
    Liu, Dayiheng
    Zhang, Dongmei
    Li, Zhixu
    Xiao, Yanghua
    PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023, : 245 - 255
  • [2] HILL: A Hallucination Identifier for Large Language Models
    Leiser, Florian
    Eckhardt, Sven
    Leuthe, Valentin
    Knaeble, Merlin
    Maedche, Alexander
    Schwabe, Gerhard
    Sunyaev, Ali
    PROCEEDINGS OF THE 2024 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI 2024), 2024
  • [3] Large Language Models: The Next Frontier for Variable Discovery within Metamorphic Testing
    Tsigkanos, Christos
    Rani, Pooja
    Mueller, Sebastian
    Kehrer, Timo
    2023 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION AND REENGINEERING, SANER, 2023, : 678 - 682
  • [4] Woodpecker: hallucination correction for multimodal large language models
    Yin, Shukang
    Fu, Chaoyou
    Zhao, Sirui
    Xu, Tong
    Wang, Hao
    Sui, Dianbo
    Shen, Yunhang
    Li, Ke
    Sun, Xing
    Chen, Enhong
    Science China Information Sciences, 2024, 67 (12) : 52 - 64
  • [5] Mitigating Factual Inconsistency and Hallucination in Large Language Models
    Muneeswaran, I
    Shankar, Advaith
    Varun, V.
    Gopalakrishnan, Saisubramaniam
    Vaddina, Vishal
    PROCEEDINGS OF THE 17TH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, WSDM 2024, 2024, : 1169 - 1170
  • [6] Untangling Emotional Threads: Hallucination Networks of Large Language Models
    Goodarzi, Mahsa
    Venkatakrishnan, Radhakrishnan
    Canbaz, M. Abdullah
    COMPLEX NETWORKS & THEIR APPLICATIONS XII, VOL 1, COMPLEX NETWORKS 2023, 2024, 1141 : 202 - 214
  • [7] Investigating Hallucination Tendencies of Large Language Models in Japanese and English
    Tsuruta, Hiromi
    Sakaguchi, Rio
    Research Square
  • [8] Evaluating Natural Language Inference Models: A Metamorphic Testing Approach
    Jiang, Mingyue
    Bao, Houzhen
    Tu, Kaiyi
    Zhang, Xiao-Yi
    Ding, Zuohua
    2021 IEEE 32ND INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING (ISSRE 2021), 2021, : 220 - 230
  • [9] Metamorphic Malware Evolution: The Potential and Peril of Large Language Models
    Madani, Pooria
    2023 5TH IEEE INTERNATIONAL CONFERENCE ON TRUST, PRIVACY AND SECURITY IN INTELLIGENT SYSTEMS AND APPLICATIONS, TPS-ISA, 2023, : 74 - 81