Image Caption via Visual Attention Switch on DenseNet

Cited by: 0

Authors:

Hao, Yanlong [1]

Xie, Jiyang [1]

Lin, Zhiqing [1]

Affiliations:

[1] Beijing Univ Posts & Telecommun, Pattern Recognit & Intelligent Syst Lab, Beijing 100876, Peoples R China

Source:

PROCEEDINGS OF 2018 INTERNATIONAL CONFERENCE ON NETWORK INFRASTRUCTURE AND DIGITAL CONTENT (IEEE IC-NIDC) | 2018

Keywords:

Image caption; Visual attention switch; Encoder-decoder architecture; DenseNet; LSTM;

DOI:

Not available

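Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification code: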

0808 ; 0809 ;

Abstract:

We introduce an approach for converting images into corresponding natural-language descriptions. The method follows the widely used encoder-decoder architecture. The encoder uses the recently proposed densely connected convolutional network (DenseNet) to extract feature maps, and the decoder uses a long short-term memory (LSTM) network to translate those feature maps into descriptions. A "visual attention switch" combines the feature maps with the word embedding of the current input word to predict the next word of the description. Finally, we compare the proposed model with several baseline models and achieve good results.
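
The abstract does not say how the "visual attention switch" is computed, so the following is only an illustrative sketch of one way feature-map regions and a word embedding could be combined at a single decoding step. It assumes dot-product soft attention over regions and a scalar sigmoid gate; the function name `attention_switch_step`, the gate weights `w_gate`, and all dimensions are hypothetical and are not taken from the paper.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attention_switch_step(regions, word_emb, hidden, w_gate):
    """Fuse visual context and word embedding for one decoding step.

    regions  : K region vectors of length D (a flattened feature map)
    word_emb : embedding of the current input word, length D
    hidden   : previous decoder hidden state, length D
    w_gate   : hypothetical gate weights, length 2 * D
    """
    # Soft attention (assumed): weight each region by its match to the hidden state.
    weights = softmax([dot(r, hidden) for r in regions])
    context = [
        sum(w * r[d] for w, r in zip(weights, regions))
        for d in range(len(hidden))
    ]
    # Switch (assumed): a scalar in (0, 1) that trades visual context against the word.
    beta = sigmoid(dot(w_gate, hidden + word_emb))
    fused = [beta * c + (1.0 - beta) * e for c, e in zip(context, word_emb)]
    return fused, weights, beta


# Toy inputs standing in for DenseNet features and learned parameters.
random.seed(0)
D, K = 4, 6
regions = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(K)]
word_emb = [random.uniform(-1, 1) for _ in range(D)]
hidden = [random.uniform(-1, 1) for _ in range(D)]
w_gate = [random.uniform(-1, 1) for _ in range(2 * D)]

fused, weights, beta = attention_switch_step(regions, word_emb, hidden, w_gate)
```

In a full model, `fused` would be passed to the LSTM as its input for the step, and `regions` would come from the encoder's final feature map rather than random values.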

Pages: 334-338

Page count: 5
