Vision-Aware Language Reasoning for Referring Image Segmentation

Cited by: 0

Authors:

Fayou Xu

Bing Luo

Chao Zhang

Li Xu

Mingxing Pu

Bo Li

Affiliations:

[1] Xihua University, School of Computer and Software Engineering

[2] Sichuan Police College, Key Laboratory of Intelligent Policing

[3] Xihua University, School of Science

Source:

Neural Processing Letters | 2023 / Vol. 55

Keywords:

Referring image segmentation; Vision and language; Explainable language-structure reasoning

DOI:

Not available

CLC number:

Subject classification number:

Abstract:

Referring image segmentation is a multimodal task that aims to segment the object indicated by a natural-language expression from the paired image. However, the diversity of language annotations tends to cause semantic ambiguity, which makes the semantic representation produced by language feature encoding imprecise. Existing methods do not correct the language encoding module, so semantic errors in the language features cannot be repaired in later stages, resulting in semantic deviation. To this end, we propose a vision-aware language reasoning model. Intuitively, the segmentation result can guide the reconstruction of language features, which can be expressed as a tree-structured recursive process. Specifically, we design a language reasoning encoding module and a mask loopback optimization module to optimize the language encoding tree. The feature weights of the tree nodes are learned through backpropagation. The traditional attention module, which matches local words with visual regions, easily introduces noise regions; to overcome this, we use global language prior information to compute the importance of different words and then weight the visual region features accordingly, which is embodied as a language-aware vision attention module. Experimental results on four benchmark datasets show that the proposed method improves performance.
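
The idea of weighting visual regions by word importance derived from a global language prior can be illustrated with a small sketch. The abstract does not give the actual formulas, so everything below is an assumption made for illustration: the function name `language_aware_attention`, the scaled dot-product scoring, the softmax over words, and the sigmoid gate over regions are hypothetical choices, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def language_aware_attention(word_feats, global_feat, region_feats):
    """Hypothetical sketch: gate visual regions with a word-importance-weighted query.

    word_feats   : list of T word vectors, each of length d
    global_feat  : one sentence-level vector of length d (the "global prior")
    region_feats : list of N visual region vectors, each of length d
    """
    d = len(global_feat)
    scale = math.sqrt(d)

    # Assumed step 1: score each word against the global sentence feature,
    # then normalize so the word importances sum to 1.
    word_weights = softmax([dot(w, global_feat) / scale for w in word_feats])

    # Assumed step 2: build a single language query as the
    # importance-weighted sum of the word features.
    query = [
        sum(a * w[i] for a, w in zip(word_weights, word_feats))
        for i in range(d)
    ]

    # Assumed step 3: a sigmoid gate per region measures how well the
    # region matches the language query.
    gates = [
        1.0 / (1.0 + math.exp(-dot(v, query) / scale))
        for v in region_feats
    ]

    # Assumed step 4: scale every region feature by its gate.
    weighted = [[g * x for x in v] for g, v in zip(gates, region_feats)]
    return word_weights, gates, weighted
```

Calling `language_aware_attention(words, sentence, regions)` with lists of equal-dimension vectors returns the word importances, the per-region gates, and the gated region features. In this sketch, words that align with the sentence-level feature dominate the query, so regions matching only an unimportant word receive smaller gates.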


Pages: 11313-11331

Number of pages: 18


