Image Caption via Visual Attention Switch on DenseNet

Cited by: 0
Authors
Hao, Yanlong [1]
Xie, Jiyang [1]
Lin, Zhiqing [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Pattern Recognit & Intelligent Syst Lab, Beijing 100876, Peoples R China
Keywords
Image caption; Visual attention switch; Encoder-decoder architecture; DenseNet; LSTM
DOI
Not available
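Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809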
Abstract
We introduce a novel approach for converting images into corresponding natural-language descriptions. The method follows the widely used encoder-decoder architecture. The encoder uses the recently proposed densely connected convolutional network (DenseNet) to extract feature maps, and the decoder uses a long short-term memory (LSTM) network to decode the feature maps into descriptions. A "visual attention switch" combines the feature maps with the word embedding of the current input word to predict the next word of the description. Finally, we compare the performance of the proposed model with other baseline models and achieve good results.
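The abstract does not say how the "visual attention switch" is computed, so the following is only a minimal sketch of one possible reading, not the authors' method. It assumes soft attention over region features (standing in for DenseNet feature maps) and a scalar sigmoid gate that mixes the attended visual context with the current word embedding before the result would be passed to an LSTM step. The function names, the gate form, and the toy vectors are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(features, hidden):
    """Soft attention: weight each region feature by its similarity to the hidden state."""
    weights = softmax([dot(f, hidden) for f in features])
    dim = len(features[0])
    context = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return context, weights


def switch_input(features, word_embedding, hidden, gate_weights):
    """Assumed 'switch': a sigmoid gate beta in (0, 1) blends visual context and word embedding."""
    context, weights = attend(features, hidden)
    beta = 1.0 / (1.0 + math.exp(-dot(gate_weights, hidden)))
    mixed = [beta * c + (1.0 - beta) * e for c, e in zip(context, word_embedding)]
    return mixed, beta, weights


# Toy values (hypothetical): three 2-d region features, one word embedding, one hidden state.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
word_embedding = [0.5, -0.5]
hidden = [0.2, 0.1]
gate_weights = [0.0, 0.0]  # a zero gate vector gives an even blend of the two inputs

mixed, beta, weights = switch_input(features, word_embedding, hidden, gate_weights)
```

In this sketch, `mixed` is the vector that would be fed to the decoder at the current time step; a gate near 1 favors the image features, and a gate near 0 favors the previous word.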
Pages: 334-338
Number of pages: 5