Vision-Aware Language Reasoning for Referring Image Segmentation

Authors
Fayou Xu
Bing Luo
Chao Zhang
Li Xu
Mingxing Pu
Bo Li
Affiliations
[1] Xihua University, School of Computer and Software Engineering
[2] Sichuan Police College, Key Laboratory of Intelligent Policing
[3] Xihua University, School of Science
Source
Neural Processing Letters | 2023 / Vol. 55
Keywords
Referring image segmentation; Vision and language; Explainable language-structure reasoning
DOI
Not available
Abstract
Referring image segmentation is a multimodal task that aims to segment the object indicated by a natural-language expression from the paired image. However, the diversity of language annotations tends to introduce semantic ambiguity, which makes the semantic representation produced by language feature encoding imprecise. Existing methods do not correct the language encoding module, so semantic errors in the language features cannot be repaired in subsequent stages, resulting in semantic deviation. To this end, we propose a vision-aware language reasoning model. Intuitively, the segmentation result can be used to guide the reconstruction of language features, which can be expressed as a tree-structured recursive process. Specifically, we design a language reasoning encoding module and a mask loopback optimization module to optimize the language encoding tree. The feature weights of the tree nodes are learned through backpropagation. In the traditional attention module, matching local words against visual regions easily introduces noise regions. To overcome this problem, we use global language prior information to calculate the importance of different words and use that importance to further weight the visual region features, which is embodied as a language-aware vision attention module. Experimental results on four benchmark datasets show that the proposed method improves performance.
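The language-aware vision attention idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no equations, so the mean-pooled global prior, the scaled dot-product scoring, the softmax normalization, and the names `language_aware_attention`, `words`, and `regions` are all assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def language_aware_attention(words, regions):
    """Weight visual regions with a word-importance-weighted sentence vector.

    words:   list of T word feature vectors, each of length d
    regions: list of N visual region feature vectors, each of length d
    Returns (word_importance, region_weights, weighted_regions).
    """
    d = len(words[0])
    scale = math.sqrt(d)
    # Assumption: the global language prior is the mean of the word features.
    global_prior = [sum(w[k] for w in words) / len(words) for k in range(d)]
    # Importance of each word, scored against the global prior.
    word_importance = softmax([dot(w, global_prior) / scale for w in words])
    # Sentence vector: word features combined by their importance.
    sentence = [
        sum(a * w[k] for a, w in zip(word_importance, words)) for k in range(d)
    ]
    # Regions are scored against the whole sentence, not against single words.
    region_weights = softmax([dot(r, sentence) / scale for r in regions])
    weighted_regions = [
        [a * x for x in r] for a, r in zip(region_weights, regions)
    ]
    return word_importance, region_weights, weighted_regions


# Toy input: two words point along the first axis, one along the second.
words = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
regions = [[1.0, 0.0], [0.0, 1.0]]
word_importance, region_weights, weighted = language_aware_attention(words, regions)
```

In this toy input the first region aligns with the majority of the words, so it receives the larger weight. Scoring regions against a single importance-weighted sentence vector, rather than against each word separately, is one way to limit the influence of an off-topic word on the region weights.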
Pages: 11313-11331
Page count: 18