Attentive Excitation and Aggregation for Bilingual Referring Image Segmentation

被引:4
|
作者
Zhou, Qianli [1 ]
Hui, Tianrui [2 ,5 ]
Wang, Rong [1 ]
Hu, Haimiao [3 ]
Liu, Si [4 ]
机构
[1] Peoples Publ Secur Univ China, 1 Muxidi Nanli, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Informat Engn, 89 Minzhuang Rd, Beijing, Peoples R China
[3] Beihang Univ, 37 Xueyuan Rd, Beijing, Peoples R China
[4] Beihang Univ, Inst Artificial Intelligence, 37 Xueyuan Rd, Beijing, Peoples R China
[5] Univ Chinese Acad Sci, Sch Cyber Secur, 19 Yuquan Rd, Beijing, Peoples R China
基金
北京市自然科学基金; 中国国家自然科学基金;
关键词
Bilingual referring segmentation; channel excitation; spatial aggregation;
D O I
10.1145/3446345
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The goal of referring image segmentation is to identify the object matched with an input natural language expression. Previous methods only support English descriptions, whereas Chinese is also broadly used around theworld, which limits the potential application of this task. Therefore, we propose to extend existing datasets with Chinese descriptions and preprocessing tools for training and evaluating bilingual referring segmentation models. In addition, previous methods also lack the ability to collaboratively learn channel-wise and spatial-wise cross-modal attention to well align visual and linguistic modalities. To tackle these limitations, we propose a Linguistic Excitation module to excite image channels guided by language information and a Linguistic Aggregation module to aggregate multimodal information based on image-language relationships. Since different levels of features from the visual backbone encode rich visual information, we also propose a Cross-Level Attentive Fusion module to fuse multilevel features gated by language information. Extensive experiments on four English and Chinese benchmarks show that our bilingual referring image segmentation model outperforms previous methods.
引用
收藏
页数:17
相关论文
共 50 条
  • [1] Hierarchical collaboration for referring image segmentation
    Zhang, Wei
    Cheng, Zesen
    Chen, Jie
    Gao, Wen
    Neurocomputing, 2025, 613
  • [2] Toward Robust Referring Image Segmentation
    Wu, Jianzong
    Li, Xiangtai
    Li, Xia
    Ding, Henghui
    Tong, Yunhai
    Tao, Dacheng
    IEEE Transactions on Image Processing, 2024, 33 : 1782 - 1794
  • [3] Toward Robust Referring Image Segmentation
    Wu, Jianzong
    Li, Xiangtai
    Li, Xia
    Ding, Henghui
    Tong, Yunhai
    Tao, Dacheng
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 1782 - 1794
  • [4] RRSIS: Referring Remote Sensing Image Segmentation
    Yuan, Zhenghang
    Mou, Lichao
    Hua, Yuansheng
    Zhu, Xiao Xiang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 12
  • [5] Referring Image Segmentation Using Text Supervision
    Liu, Fang
    Liu, Yuhao
    Kong, Yuqiu
    Xu, Ke
    Zhang, Lihe
    Yin, Baocai
    Hancke, Gerhard
    Lau, Rynson
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 22067 - 22077
  • [6] Image Segmentation With Language Referring Expression and Comprehension
    Sun, Jiaxing
    Li, Yujie
    Cai, Jintong
    Lu, Huimin
    Serikawa, Seiichi
    IEEE SENSORS JOURNAL, 2022, 22 (18) : 17406 - 17413
  • [7] Contrastive Grouping with Transformer for Referring Image Segmentation
    Tang, Jiajin
    Zheng, Ge
    Shi, Cheng
    Yang, Sibei
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 23570 - 23580
  • [8] Recurrent Multimodal Interaction for Referring Image Segmentation
    Liu, Chenxi
    Lin, Zhe
    Shen, Xiaohui
    Yang, Jimei
    Lu, Xin
    Yuille, Alan
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 1280 - 1289
  • [9] Structured Attention Network for Referring Image Segmentation
    Lin, Liang
    Yan, Pengxiang
    Xu, Xiaoqian
    Yang, Sibei
    Zeng, Kun
    Li, Guanbin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 1922 - 1932
  • [10] Referring Image Segmentation by Generative Adversarial Learning
    Qiu, Shuang
    Zhao, Yao
    Jiao, Jianbo
    Wei, Yunchao
    Wei, Shikui
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (05) : 1333 - 1344