Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models

Times cited: 0
Authors
Wu, Junfei [1 ,2 ]
Liu, Qiang [1 ,2 ]
Wang, Ding [1 ,2 ]
Zhang, Jinghao [1 ,2 ]
Wu, Shu [1 ,2 ]
Wang, Liang [1 ,2 ]
Tan, Tieniu [1 ,2 ,3 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, New Lab Pattern Recognit NLPR, State Key Lab Multimodal Artificial Intelligence, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
[3] Nanjing Univ, Nanjing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Object hallucination has been an Achilles' heel that hinders the broader application of large vision-language models (LVLMs). Object hallucination refers to the phenomenon in which an LVLM claims that non-existent objects appear in the image. To mitigate object hallucinations, instruction tuning and external model-based detection methods have been proposed, but they either require large-scale computational resources or depend on the detection results of external models. Utilizing the LVLM itself to alleviate object hallucinations, however, remains under-explored. In this work, we adopt the intuition that an LVLM tends to respond logically consistently about existent objects but inconsistently about hallucinated ones. We therefore propose a Logical Closed Loop-based framework for Object Hallucination Detection and Mitigation, namely LogicCheckGPT. Specifically, we devise logical consistency probing, which raises logically correlated questions, inquiring about attributes from objects and about objects from attributes. Whether the responses can form a logical closed loop serves as an indicator of object hallucination. As a plug-and-play method, it can be seamlessly applied to all existing LVLMs. Comprehensive experiments on three benchmarks across four LVLMs demonstrate significant improvements brought by our method, indicating its effectiveness and generality.
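To make the probing idea concrete, below is a minimal, hypothetical sketch of logical-closed-loop probing reconstructed from the abstract alone. The helper `query_lvlm(image, question)`, the color-based question templates, and the substring consistency check are illustrative assumptions, not the paper's released implementation.

```python
# Hypothetical sketch of logical-closed-loop probing for object hallucination
# detection, based only on the abstract above. `query_lvlm(image, question)`
# is an assumed helper that asks any LVLM a question about `image` and
# returns its textual answer; it is NOT part of the paper's code.
from typing import Callable, List

QueryFn = Callable[[object, str], str]


def loop_closes(obj: str, image: object, query_lvlm: QueryFn) -> bool:
    """Ask object -> attribute, then attribute -> object, and check whether
    the answers lead back to the original object (the loop 'closes')."""
    # Forward probe: ask about an attribute of the candidate object.
    attribute = query_lvlm(image, f"What color is the {obj} in the image?")
    # Backward probe: ask which object carries that attribute.
    backward_q = f"Which object in the image is {attribute.strip().rstrip('.')}?"
    recovered = query_lvlm(image, backward_q)
    # Consistent answers suggest a real object; inconsistency suggests hallucination.
    return obj.lower() in recovered.lower()


def detect_hallucinations(objects: List[str], image: object,
                          query_lvlm: QueryFn) -> List[str]:
    """Return candidate objects whose answers fail to form a logical closed
    loop, i.e., the likely hallucinated ones."""
    return [obj for obj in objects if not loop_closes(obj, image, query_lvlm)]
```

In the full framework, objects flagged this way would then be removed or rewritten out of the LVLM's response; the sketch above only illustrates the detection step under the stated assumptions.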
Pages: 6944-6962
Number of pages: 19
Related papers
50 records in total
  • [1] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
    Leng, Sicong
    Zhang, Hang
    Chen, Guanzheng
    Li, Xin
Lu, Shijian
    Miao, Chunyan
    Bing, Lidong
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 13872 - 13882
  • [2] Evaluating Object Hallucination in Large Vision-Language Models
    Li, Yifan
    Du, Yifan
    Zhou, Kun
    Wang, Jinpeng
    Zhao, Wayne Xin
    Wen, Ji-Rong
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 292 - 305
  • [3] Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding
    Wang, Xintong
    Pan, Jingheng
    Ding, Liang
    Biemann, Chris
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 15840 - 15853
  • [4] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
    Zhang, Jinrui
    Wang, Teng
    Zhang, Haigang
    Lu, Ping
    Zheng, Feng
    COMPUTER VISION - ECCV 2024, PT XXXVII, 2025, 15095 : 196 - 213
  • [5] Exploiting Semantic Reconstruction to Mitigate Hallucinations in Vision-Language Models
    Kim, Minchan
    Kim, Minyeong
    Bae, Junik
    Choi, Suhwan
    Kim, Sungkyung
Chang, Buru
    COMPUTER VISION - ECCV 2024, PT LXXXVI, 2025, 15144 : 236 - 252
  • [6] Attention Prompting on Image for Large Vision-Language Models
    Yu, Runpeng
    Yu, Weihao
    Wang, Xinchao
    COMPUTER VISION - ECCV 2024, PT XXX, 2025, 15088 : 251 - 268
  • [7] Effectiveness assessment of recent large vision-language models
Jiang, Yao
Yan, Xinyu
Ji, Ge-Peng
Fu, Keren
Sun, Meijun
Xiong, Huan
Fan, Deng-Ping
Khan, Fahad Shahbaz
    Visual Intelligence, 2 (1):
  • [8] Evaluating Attribute Comprehension in Large Vision-Language Models
    Zhang, Haiwen
    Yang, Zixi
    Liu, Yuanzhi
    Wang, Xinran
    He, Zheqi
    Liang, Kongming
    Ma, Zhanyu
    PATTERN RECOGNITION AND COMPUTER VISION, PT V, PRCV 2024, 2025, 15035 : 98 - 113
  • [9] On Evaluating Adversarial Robustness of Large Vision-Language Models
    Zhao, Yunqing
    Pang, Tianyu
    Du, Chao
    Yang, Xiao
    Li, Chongxuan
    Cheung, Ngai-Man
    Lin, Min
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [10] Detecting and Preventing Hallucinations in Large Vision Language Models
    Gunjal, Anisha
    Yin, Jihan
    Bas, Erhan
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 18135 - 18143