Global and Local Interactive Perception Network for Referring Image Segmentation

Authors
Liu, Jing [1]
Tan, Hongchen [1]
Hu, Yongli [1]
Sun, Yanfeng [1]
Wang, Huasheng [2]
Yin, Baocai [1]
Affiliations
[1] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
[2] Cardiff Univ, Sch Comp Sci & Informat, Cardiff CF10 3AT, Wales
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
Semantics; Image segmentation; Task analysis; Feature extraction; Visualization; Detectors; Object detection; Attention mechanism; global perception; local perception; referring image segmentation (RIS); transformer;
DOI
10.1109/TNNLS.2023.3308550
Chinese Library Classification number
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Effective fusion and perception between the language and image modalities are necessary for inferring the referred instance in the referring image segmentation (RIS) task. In this article, we propose a novel RIS network, the global and local interactive perception network (GLIPN), which improves the quality of modal fusion between language and image from both local and global perspectives. The core of GLIPN is the global and local interactive perception (GLIP) scheme, which contains a local perception module (LPM) and a global perception module (GPM). The LPM enhances local modal fusion through the correspondence between words and local image semantics. The GPM injects the global structured semantics of the image into the modal fusion process, which better guides the word embeddings to perceive the image's global structure. Extensive experiments on several benchmark datasets demonstrate that GLIPN, by combining local and global context semantics fusion, has an advantage over most state-of-the-art approaches.
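The abstract names two fusion directions without giving their internals, so the following is only an illustrative sketch and not the paper's implementation. It assumes that local fusion can be approximated by each word attending over per-location image features, and that global fusion can be approximated by adding a mean-pooled image descriptor to each word embedding. Mean pooling stands in for the "global structured semantics", which the abstract does not define. The function names `local_fusion` and `global_fusion` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def local_fusion(words, pixels):
    """Assumed stand-in for local perception: each word embedding attends
    over the per-location image features (scaled dot-product attention)
    and the attended context is added back to the word."""
    dim = len(words[0])
    scale = math.sqrt(dim)
    fused = []
    for w in words:
        weights = softmax([dot(w, p) / scale for p in pixels])
        context = [
            sum(wt * p[d] for wt, p in zip(weights, pixels))
            for d in range(dim)
        ]
        fused.append([wi + ci for wi, ci in zip(w, context)])
    return fused


def global_fusion(words, pixels):
    """Assumed stand-in for global perception: a mean-pooled image
    descriptor is added to every word embedding."""
    n = len(pixels)
    dim = len(pixels[0])
    global_desc = [sum(p[d] for p in pixels) / n for d in range(dim)]
    return [[wi + gi for wi, gi in zip(w, global_desc)] for w in words]


if __name__ == "__main__":
    # Toy inputs: one 2-D word embedding, two 2-D image locations.
    words = [[1.0, 0.0]]
    pixels = [[1.0, 0.0], [0.0, 1.0]]
    print(local_fusion(words, pixels))
    print(global_fusion(words, pixels))
```

In this toy setup the word is aligned with the first image location, so the local variant weights that location more heavily, while the global variant treats all locations equally.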
Pages: 1 / 14
Number of pages: 14