Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models

Cited by: 0

Authors:

Wu, Junfei [1, 2]

Liu, Qiang [1, 2]

Wang, Ding [1, 2]

Zhang, Jinghao [1, 2]

Wu, Shu [1, 2]

Wang, Liang [1, 2]

Tan, Tieniu [1, 2, 3]

Affiliations:

[1] Chinese Academy of Sciences, Institute of Automation, New Laboratory of Pattern Recognition (NLPR), State Key Laboratory of Multimodal Artificial Intelligence, Beijing, People's Republic of China

[2] University of Chinese Academy of Sciences, School of Artificial Intelligence, Beijing, People's Republic of China

[3] Nanjing University, Nanjing, People's Republic of China

Source:

FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024 | 2024

Funding:

National Natural Science Foundation of China

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Object hallucination has been an Achilles' heel that hinders the broader application of large vision-language models (LVLMs). Object hallucination refers to the phenomenon in which an LVLM claims that objects not present in the image exist. To mitigate object hallucinations, instruction tuning and external model-based detection methods have been proposed, which either require large-scale computational resources or depend on the detection results of external models. However, using the LVLM itself to alleviate object hallucinations remains under-explored. In this work, we adopt the intuition that an LVLM tends to respond in a logically consistent way for existent objects but inconsistently for hallucinated objects. We therefore propose a Logical Closed Loop-based framework for Object Hallucination Detection and Mitigation, namely LogicCheckGPT. Specifically, we devise logical consistency probing to raise questions with logical correlations, inquiring about attributes from objects and vice versa. Whether the responses form a logical closed loop serves as an indicator of object hallucination. As a plug-and-play method, it can be seamlessly applied to all existing LVLMs. Comprehensive experiments conducted on three benchmarks across four LVLMs demonstrate significant improvements brought by our method, indicating its effectiveness and generality.
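The closed-loop idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `ask` interface, the two prompt templates, the substring match, and the 0.5 threshold are all assumptions made for illustration, and the LVLM is replaced by a hypothetical lookup-table stub so the example runs on its own.

```python
from typing import Callable, Dict, List

# Hypothetical interface: a function that sends one question about the image
# to the LVLM and returns a list of short answers.
Ask = Callable[[str], List[str]]


def closed_loop_rate(obj: str, ask: Ask) -> float:
    """Fraction of attribute probes whose reverse question leads back to obj."""
    # Object -> attributes (assumed prompt template).
    attributes = ask(f"Describe the {obj} in the image.")
    if not attributes:
        # Assumption: an object with no described attributes closes no loop.
        return 0.0
    closed = 0
    for attr in attributes:
        # Attribute -> objects (assumed prompt template).
        candidates = ask(f"Which object in the image is {attr}?")
        if any(obj.lower() in c.lower() for c in candidates):
            closed += 1
    return closed / len(attributes)


def is_hallucinated(obj: str, ask: Ask, threshold: float = 0.5) -> bool:
    """Flag obj when its closed-loop rate is below an assumed threshold."""
    return closed_loop_rate(obj, ask) < threshold


# Hypothetical canned answers standing in for a real LVLM.
STUB_ANSWERS: Dict[str, List[str]] = {
    "Describe the dog in the image.": ["brown", "sitting on grass"],
    "Which object in the image is brown?": ["dog"],
    "Which object in the image is sitting on grass?": ["the dog"],
    "Describe the frisbee in the image.": ["red", "round"],
    "Which object in the image is red?": ["ball"],
    "Which object in the image is round?": ["ball"],
}


def stub_ask(question: str) -> List[str]:
    """Return the canned answers for a question, or an empty list."""
    return STUB_ANSWERS.get(question, [])


if __name__ == "__main__":
    for name in ("dog", "frisbee"):
        print(name, closed_loop_rate(name, stub_ask), is_hallucinated(name, stub_ask))
```

In this toy setup the answers about the dog point back to the dog, so its loop closes, while the attributes given for the frisbee lead to a different object, so it is flagged. A real system would need a more robust answer-matching step than the substring test used here.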


Pages: 6944-6962

Page count: 19
