A cooperative approach based on self-attention with interactive attribute for image caption

Cited by: 10

Authors:

Zhao, Dexin [1]

Yang, Ruixue [1]

Wang, Zhaohui [1]

Qi, Zhiyang [1]

Affiliations:

[1] Tianjin Univ Technol, Tianjin Key Lab Intelligence Comp & Novel Softwar, Tianjin 300384, Peoples R China

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2023 / Vol. 82 / Issue 01

Keywords:

Image caption; Deep neural network; Self-attention;

DOI:

10.1007/s11042-022-13279-z

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Image captioning is a challenging problem in image understanding, and most models are trained with a framework that combines a deep convolutional neural network with a recurrent neural network. However, the features extracted by the convolutional neural network mainly capture the information of salient regions and fail to cover the details of the image. Moreover, the vanishing gradient problem of recurrent neural networks causes earlier information to be lost as the number of time steps grows. In this paper, Cooperative Self-Attention (CSA) is proposed to address these problems. Compared with existing methods, our model enhances the representation of the image by fusing additional attribute information obtained from object detection. A sub-module named Inter-Attribute, which indicates the interaction of objects, is proposed to strengthen the context of the entities. By virtue of the advantages of self-attention, and unlike previous methods that predict the next word from one prior word and the hidden state, our model concatenates all of the words generated so far, step by step, to handle long-term dependencies. Compared with published state-of-the-art methods, our CSA demonstrates outstanding performance.
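
The idea of attending over every word generated so far, rather than over one prior word and a hidden state, can be illustrated with a minimal sketch. This is generic single-head scaled dot-product self-attention written in plain Python, not the CSA model described in the abstract: the function names, the toy 2-dimensional embeddings, and the absence of learned projection matrices are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_history(history):
    """Summarize all generated word vectors with scaled dot-product attention.

    `history` is a list of equal-length embedding vectors, one per word
    generated so far. The most recent word acts as the query; every word
    (including the most recent one) acts as both key and value.
    No learned projections are used (a simplification, not the paper's design).
    """
    query = history[-1]
    dim = len(query)
    # Similarity of the query to every earlier word, scaled by sqrt(dim).
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in history
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of all word vectors.
    context = [
        sum(w * vec[i] for w, vec in zip(weights, history))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical 2-dimensional embeddings for three generated words.
history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend_over_history(history)
```

Because the whole history is visible at every step, the first word contributes to the context vector directly through its attention weight instead of only through a recurrent state that has been updated many times.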

Pages: 1223-1236

Page count: 14
