Drowzee: Metamorphic Testing for Fact-Conflicting Hallucination Detection in Large Language Models

Cited by: 0
Authors
Li, Ningke [1 ]
Li, Yuekang [2 ]
Liu, Yi [3 ]
Shi, Ling [3 ]
Wang, Kailong [1 ]
Wang, Haoyu [1 ]
Affiliations
[1] Huazhong University of Science and Technology, Wuhan, China
[2] The University of New South Wales, Sydney, Australia
[3] Nanyang Technological University, Singapore, Singapore
Abstract
Large language models (LLMs) have revolutionized language processing, but face critical challenges with security, privacy, and generating hallucinations (coherent but factually inaccurate outputs). A major issue is fact-conflicting hallucination (FCH), where LLMs produce content that contradicts ground-truth facts. Addressing FCH is difficult due to two key challenges: 1) Automatically constructing and updating benchmark datasets is hard, as existing methods rely on manually curated static benchmarks that cannot cover the broad, evolving spectrum of FCH cases. 2) Validating the reasoning behind LLM outputs is inherently difficult, especially for complex logical relations. To tackle these challenges, we introduce a novel logic-programming-aided metamorphic testing technique for FCH detection. We develop an extensive and extensible framework that constructs a comprehensive factual knowledge base by crawling sources such as Wikipedia, seamlessly integrated into Drowzee. Using logical reasoning rules, we transform and augment this knowledge into a large set of test cases with ground-truth answers. We test LLMs on these cases through template-based prompts, requiring them to provide reasoned answers. To validate their reasoning, we propose two semantic-aware oracles that assess the similarity between the semantic structures of the LLM answers and the ground truth. Our approach automatically generates useful test cases and identifies hallucinations across six LLMs within nine domains, with hallucination rates ranging from 24.7% to 59.8%. Key findings include LLMs struggling with temporal concepts, out-of-distribution knowledge, and a lack of logical reasoning capabilities. The results show that logic-based test cases generated by Drowzee effectively trigger and detect hallucinations. To further mitigate the identified FCHs, we explored model editing techniques, which proved effective on a small scale (with edits to fewer than 1,000 knowledge pieces). Our findings emphasize the need for continued community efforts to detect and mitigate model hallucinations. © 2024 Copyright held by the owner/author(s).
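The abstract describes the pipeline only at a high level. As a rough illustration (not the authors' implementation), the Python sketch below shows the two core ideas: a hand-written logical reasoning rule that derives new ground-truth facts from knowledge-base triples and turns each derived fact into a templated test prompt, and a toy semantic-structure oracle that flags an answer whose recovered triple conflicts with the ground truth. The knowledge base, relation names, prompt template, and triple extractor here are all hypothetical stand-ins for Drowzee's actual components.

```python
# Illustrative sketch only: rule-based test-case generation plus a naive
# structure-matching oracle, in the spirit of the abstract's description.
from itertools import product

# Toy factual knowledge base of (subject, relation, object) triples (hypothetical).
KB = {
    ("Alan Turing", "born_in", "United Kingdom"),
    ("United Kingdom", "located_in", "Europe"),
    ("Marie Curie", "born_in", "Poland"),
    ("Poland", "located_in", "Europe"),
}

def derive_facts(kb):
    """Apply one composition rule:
    born_in(X, Y) and located_in(Y, Z)  =>  born_on_continent(X, Z)."""
    derived = set()
    for (s1, r1, o1), (s2, r2, o2) in product(kb, kb):
        if r1 == "born_in" and r2 == "located_in" and o1 == s2:
            derived.add((s1, "born_on_continent", o2))
    return derived

PROMPT_TEMPLATE = (
    "Was {subject} born on the continent of {object}? "
    "Answer yes or no and explain your reasoning."
)

def make_test_cases(kb):
    """Turn each derived fact into a (prompt, ground-truth triple) test case."""
    return [
        (PROMPT_TEMPLATE.format(subject=s, object=o), (s, r, o))
        for (s, r, o) in derive_facts(kb)
    ]

def extract_triple(answer, subject, obj):
    """Very naive stand-in for a semantic parser: recover a triple from the
    answer text. A real oracle would compare richer semantic structures."""
    affirmative = answer.strip().lower().startswith("yes")
    return (subject, "born_on_continent", obj) if affirmative else None

def oracle(answer, ground_truth):
    """Report a fact-conflicting hallucination when the answer's recovered
    structure does not match the ground-truth triple."""
    s, _, o = ground_truth
    return "consistent" if extract_triple(answer, s, o) == ground_truth else "hallucination"

if __name__ == "__main__":
    for prompt, truth in make_test_cases(KB):
        # In practice this answer would come from the LLM under test.
        fake_llm_answer = "Yes, because the birth country lies on that continent."
        print(prompt, "->", oracle(fake_llm_answer, truth))
```

In this toy setup the derived facts serve as ground truth, so any answer whose recovered triple disagrees with them is counted as a hallucination; the paper's actual oracles compare full semantic structures rather than a single yes/no triple.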
DOI
10.1145/3689776
Related Papers
50 results in total
  • [1] Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models
    Chen, Yuyan
    Fu, Qiang
    Yuan, Yichen
    Wen, Zhihao
    Fan, Ge
    Liu, Dayiheng
    Zhang, Dongmei
    Li, Zhixu
    Xiao, Yanghua
    PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023, : 245 - 255
  • [2] HILL: A Hallucination Identifier for Large Language Models
    Leiser, Florian
    Eckhardt, Sven
    Leuthe, Valentin
    Knaeble, Merlin
    Maedche, Alexander
    Schwabe, Gerhard
    Sunyaev, Ali
    PROCEEDINGS OF THE 2024 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI 2024), 2024
  • [3] Large Language Models: The Next Frontier for Variable Discovery within Metamorphic Testing
    Tsigkanos, Christos
    Rani, Pooja
    Mueller, Sebastian
    Kehrer, Timo
    2023 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION AND REENGINEERING, SANER, 2023, : 678 - 682
  • [4] Woodpecker: hallucination correction for multimodal large language models
    Yin, Shukang
    Fu, Chaoyou
    Zhao, Sirui
    Xu, Tong
    Wang, Hao
    Sui, Dianbo
    Shen, Yunhang
    Li, Ke
    Sun, Xing
    Chen, Enhong
    Science China Information Sciences, 2024, 67 (12) : 52 - 64
  • [5] Mitigating Factual Inconsistency and Hallucination in Large Language Models
    Muneeswaran, I
    Shankar, Advaith
    Varun, V.
    Gopalakrishnan, Saisubramaniam
    Vaddina, Vishal
    PROCEEDINGS OF THE 17TH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, WSDM 2024, 2024, : 1169 - 1170
  • [6] Untangling Emotional Threads: Hallucination Networks of Large Language Models
    Goodarzi, Mahsa
    Venkatakrishnan, Radhakrishnan
    Canbaz, M. Abdullah
    COMPLEX NETWORKS & THEIR APPLICATIONS XII, VOL 1, COMPLEX NETWORKS 2023, 2024, 1141 : 202 - 214
  • [7] Investigating Hallucination Tendencies of Large Language Models in Japanese and English
    Tsuruta, Hiromi
    Sakaguchi, Rio
    Research Square
  • [8] Evaluating Natural Language Inference Models: A Metamorphic Testing Approach
    Jiang, Mingyue
    Bao, Houzhen
    Tu, Kaiyi
    Zhang, Xiao-Yi
    Ding, Zuohua
    2021 IEEE 32ND INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING (ISSRE 2021), 2021, : 220 - 230
  • [9] Metamorphic Malware Evolution: The Potential and Peril of Large Language Models
    Madani, Pooria
    2023 5TH IEEE INTERNATIONAL CONFERENCE ON TRUST, PRIVACY AND SECURITY IN INTELLIGENT SYSTEMS AND APPLICATIONS, TPS-ISA, 2023, : 74 - 81