Multiscale deep feature selection fusion network for referring image segmentation

Authors
Xianwen Dai
Jiacheng Lin
Ke Nai
Qingpeng Li
Zhiyong Li
Affiliations
[1] Hunan University, College of Computer Science and Electronic Engineering
[2] Hunan University, School of Robotics
Source
Multimedia Tools and Applications, 2023, 83 (12)
Keywords
Referring image segmentation; Semantic segmentation; Multi-modal fusion; Deep learning
DOI
Not available
Abstract
Referring image segmentation has attracted extensive attention in recent years. Previous methods have explored the difficult alignment between visual and textual features, but the problem has not been effectively addressed. The resulting lack of interaction between visual and textual features degrades model performance. To this end, we first propose a language-aware pixel feature fusion module (LPFFM) based on a self-attention mechanism, which ensures that the features of the two modalities interact sufficiently across both the spatial and channel dimensions. We apply it from the shallow to the deep layers of the encoder to gradually select visual features related to the text. Second, we propose a second selection mechanism that further selects visual features containing only the target. For this mechanism, we design an attention contrastive loss to better suppress irrelevant background information. Building on these components, we propose a multiscale deep feature selection fusion network (MDSFNet) based on the U-Net architecture. Experimental results show that the proposed method is competitive with previous methods, improving performance by 2.87%, 3.17%, and 3.81% on three benchmark datasets, RefCOCO, RefCOCO+, and G-ref, respectively.
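The abstract does not give the internals of LPFFM, so the following is only a minimal sketch of the general idea it names: attention-based fusion in which each pixel feature attends to the word features and is combined with the resulting language context. The function name `cross_attention_fuse`, the residual sum, and the absence of learned projections are all assumptions made for illustration; this is plain scaled dot-product attention, not the authors' module.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_fuse(pixels, words):
    """Fuse each pixel feature with a language context vector.

    pixels: list of N pixel feature vectors of length d (used as queries)
    words:  list of T word feature vectors of length d (used as keys and values)
    Returns (fused, weights): fused is N x d, weights is N x T.

    Hypothetical helper: a real attention layer would add learned
    query/key/value projections, which are omitted here.
    """
    d = len(pixels[0])
    scale = 1.0 / math.sqrt(d)
    fused, weights = [], []
    for p in pixels:
        # Similarity of this pixel to every word (scaled dot product).
        scores = [scale * sum(pi * wi for pi, wi in zip(p, w)) for w in words]
        attn = softmax(scores)
        # Language context: attention-weighted sum of the word features.
        context = [sum(a * w[k] for a, w in zip(attn, words)) for k in range(d)]
        # Residual fusion (an assumption): pixel feature plus language context.
        fused.append([pi + ci for pi, ci in zip(p, context)])
        weights.append(attn)
    return fused, weights


# Toy inputs: three 2-d pixel features and two 2-d word features.
pixels = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
words = [[2.0, 0.0], [0.0, 2.0]]
fused, weights = cross_attention_fuse(pixels, words)
```

In this toy input the first pixel is aligned with the first word, so its attention weight on that word is the larger of the two, while the third pixel is equally similar to both words and splits its attention evenly.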
Pages: 36287-36305
Page count: 18
Related papers
  • [1] Multiscale deep feature selection fusion network for referring image segmentation
    Dai, Xianwen
    Lin, Jiacheng
    Nai, Ke
    Li, Qingpeng
    Li, Zhiyong
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (12) : 36287 - 36305
  • [3] Structured Multimodal Fusion Network for Referring Image Segmentation
    Xue, Mingcheng
    Liu, Yu
    Xu, Kaiping
    Zhang, Haiyang
    Yu, Chengyang
    [J]. PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, ICMI 2022, 2022, : 36 - 47
  • [4] Remote Sensing Image Semantic Segmentation Method Based on a Deep Convolutional Neural Network and Multiscale Feature Fusion
    Zhang, Guangzhen
    Jiang, Wangyang
    [J]. INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS, 2023, 19 (01)
  • [5] Convolutional Neural Network-Based Multiscale Feature Selection and Evaluation in Image Segmentation
    Cao, Di
    Cao, Jian-Nong
    Deng, Liang
    Lou, Li-Ping
    [J]. IEEE ACCESS, 2024, 12 : 68003 - 68014
  • [6] Global Selection and Local Attention Network for Referring Image Segmentation
    Ding, Haixin
    Zhang, Shengchuan
    Cao, Liujuan
    [J]. PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VII, 2024, 14431 : 284 - 295
  • [7] Image semantic segmentation with hierarchical feature fusion based on deep neural network
    Yang, Dawei
    Du, Yan
    Yao, Hongli
    Bao, Liyan
    [J]. CONNECTION SCIENCE, 2022, 34 (01) : 1772 - 1784
  • [8] Multiscale feature fusion deep network for single image dehazing with continuous memory mechanism
    Xie, Zhihua
    Li, Qiang
    Zong, Sha
    Liu, Guodong
    [J]. Optik, 2023, 287
  • [9] Multiscale Feature Interactive Network for Multifocus Image Fusion
    Liu, Yu
    Wang, Lei
    Cheng, Juan
    Chen, Xun
    [J]. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2021, 70