A cooperative approach based on self-attention with interactive attribute for image caption

Cited by: 10
Authors
Zhao, Dexin [1 ]
Yang, Ruixue [1 ]
Wang, Zhaohui [1 ]
Qi, Zhiyang [1 ]
Affiliations
[1] Tianjin Univ Technol, Tianjin Key Lab Intelligence Comp & Novel Softwar, Tianjin 300384, Peoples R China
Keywords
Image caption; Deep neural network; Self-attention
DOI
10.1007/s11042-022-13279-z
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Image captioning is a challenging problem in image understanding. Most models are trained within a framework that combines a deep convolutional neural network with a recurrent neural network. However, the features extracted by the convolutional neural network mainly capture the information of salient regions and fail to cover the details in the image. Moreover, the vanishing gradient problem of recurrent neural networks causes earlier information to be lost as the time step grows. In this paper, Cooperative Self-Attention (CSA) is proposed to address these problems. Compared with existing methods, our model enhances the representation of the image by fusing additional attribute information obtained from object detection. A sub-module named Inter-Attribute, which indicates the interaction of objects, is proposed to strengthen the context of the entities. Unlike previous methods that predict the next word from one prior word and a hidden state, our model exploits the advantages of Self-Attention by concatenating all of the words generated step by step, which addresses long-term dependencies. Compared with published state-of-the-art methods, our CSA demonstrates outstanding performance.
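The idea of predicting the next word from all previously generated words, rather than from a single prior word and a hidden state, can be illustrated with a minimal sketch. This is not the authors' CSA implementation: it assumes plain scaled dot-product attention with no learned projections, and the function name `attend_over_history` and the toy two-dimensional word vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_history(history):
    """Let the latest word vector attend over every word generated so far.

    Assumption: queries, keys, and values are the raw word vectors
    (a real model would apply learned linear projections first).
    """
    query = history[-1]
    d = len(query)
    # Scaled dot-product score between the query and each earlier word.
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in history
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of all generated word vectors.
    context = [
        sum(w * vec[i] for w, vec in zip(weights, history))
        for i in range(d)
    ]
    return weights, context


# Toy embeddings for three already-generated words (hypothetical values).
history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend_over_history(history)
```

Because every earlier word contributes directly to the context vector, information from early time steps does not have to survive a chain of recurrent updates.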
Pages: 1223 - 1236
Number of pages: 14
Related papers
50 in total
  • [41] NATURAL IMAGE MATTING WITH SHIFTED WINDOW SELF-ATTENTION
    Wang, Zhikun
    Liu, Yang
    Li, Zonglin
    Wang, Chenyang
    Zhang, Shengping
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 2911 - 2915
  • [42] A feature detection network based on self-attention mechanism for underwater image processing
    Wu, Di
    Su, Boxun
    Hao, Lichao
    Wang, Ye
    Zhang, Liukun
    Yan, Zheping
    OCEAN ENGINEERING, 2024, 311
  • [43] IDTransformer: Infrared image denoising method based on convolutional transposed self-attention
    Shen, Zhengwei
    Qin, Feiwei
    Ge, Ruiquan
    Wang, Changmiao
    Zhang, Kai
    Huang, Jie
    ALEXANDRIA ENGINEERING JOURNAL, 2025, 110 : 310 - 321
  • [44] USuperGlue: an unsupervised UAV image matching network based on local self-attention
    Zhou, Yatong
    Guo, Ya
    Lin, Kuo-Ping
    Yang, Fan
    Li, Lingling
    SOFT COMPUTING, 2023, 28 (15-16) : 8889 - 8909
  • [45] Remote Sensing Image Scene Classification Based on Global Self-Attention Module
    Li, Qingwen
    Yan, Dongmei
    Wu, Wanrong
    REMOTE SENSING, 2021, 13 (22)
  • [46] Image Deblurring Algorithm Incorporating Self-Attention Mechanism
    Yu, Tingting
    Lv, Qiang
    Huang, Zhen
    Su, Zhang
    Wang, Xiangli
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2025,
  • [47] PSNet: Towards Efficient Image Restoration With Self-Attention
    Cui, Yuning
    Knoll, Alois
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (09) : 5735 - 5742
  • [48] Studying the Effects of Self-Attention for Medical Image Analysis
    Rao, Adrit
    Park, Jongchan
    Woo, Sanghyun
    Lee, Joon-Young
    Aalami, Oliver
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 3409 - 3418
  • [49] Compressive sensing image reconstruction based on deep unfolding self-attention network
    Tian, Jin-Peng
    Hou, Bao-Jun
    Jilin Daxue Xuebao (Gongxueban)/Journal of Jilin University (Engineering and Technology Edition), 2024, 54 (10) : 3018 - 3026
  • [50] TransForensics: Image Forgery Localization with Dense Self-Attention
    Hao, Jing
    Zhang, Zhixin
    Yang, Shicai
    Xie, Di
    Pu, Shiliang
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 15035 - 15044