Global and Local Interactive Perception Network for Referring Image Segmentation

Cited by: 0

Authors:

Liu, Jing [1]

Tan, Hongchen [1]

Hu, Yongli [1]

Sun, Yanfeng [1]

Wang, Huasheng [2]

Yin, Baocai [1]

Affiliations:

[1] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China

[2] Cardiff Univ, Sch Comp Sci & Informat, Cardiff CF10 3AT, Wales

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2023 / Vol. 35 / No. 12

Funding:

China Postdoctoral Science Foundation; National Natural Science Foundation of China;

Keywords:

Semantics; Image segmentation; Task analysis; Feature extraction; Visualization; Detectors; Object detection; Attention mechanism; global perception; local perception; referring image segmentation (RIS); transformer;

DOI:

10.1109/TNNLS.2023.3308550

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Effective modal fusion and perception between language and image are necessary for inferring the referred instance in the referring image segmentation (RIS) task. In this article, we propose a novel RIS network, the global and local interactive perception network (GLIPN), to enhance the quality of modal fusion between language and image from both local and global perspectives. The core of GLIPN is the global and local interactive perception (GLIP) scheme, which contains a local perception module (LPM) and a global perception module (GPM). The LPM is designed to enhance local modal fusion through the correspondence between words and local image semantics. The GPM is designed to inject the global structured semantics of the image into the modal fusion process, which better guides the word embeddings to perceive the global structure of the whole image. With this fusion of local and global context semantics, extensive experiments on several benchmark datasets demonstrate the advantage of the proposed GLIPN over most state-of-the-art approaches.
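The abstract does not specify how the LPM and GPM are built, so the following is only a minimal sketch of the general idea of two-directional language-image fusion, written with generic scaled dot-product cross-attention in NumPy. The function names `local_fusion` and `global_fusion`, the tensor shapes, and the residual additions are all assumptions made for illustration; they are not the authors' modules.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_fusion(pixels, words):
    # Hypothetical stand-in for local fusion: every pixel attends over
    # the words, so each image location picks up word-level semantics.
    d = pixels.shape[-1]
    attn = softmax(pixels @ words.T / np.sqrt(d), axis=-1)  # (N, T)
    return pixels + attn @ words, attn

def global_fusion(pixels, words):
    # Hypothetical stand-in for global fusion: every word attends over
    # all pixels, so each word embedding sees the whole image.
    d = words.shape[-1]
    attn = softmax(words @ pixels.T / np.sqrt(d), axis=-1)  # (T, N)
    return words + attn @ pixels, attn

rng = np.random.default_rng(0)
pixels = rng.normal(size=(16, 8))  # assumed: 4x4 feature map, flattened, 8 channels
words = rng.normal(size=(5, 8))    # assumed: 5 word embeddings, 8 channels

fused_pixels, a_local = local_fusion(pixels, words)
fused_words, a_global = global_fusion(pixels, words)
print(fused_pixels.shape, fused_words.shape)
```

In this sketch each row of `a_local` is a distribution over words for one pixel, and each row of `a_global` is a distribution over pixels for one word; a real RIS model would add learned projections and a mask decoder on top.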

Page numbers: 1 / 14

Page count: 14

