Joint Scence Network and Attention-Guided for Image Captioning

Cited by: 2

Authors:

Zhou, Dongming [1]

Yang, Jing [1]

Zhang, Canlong [1]

Tang, Yanping [2]

Affiliations:

[1] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541000, Peoples R China

[2] Guilin Univ Elect Technol, Sch Comp Sci & Informat Secur, Guilin 541004, Peoples R China

Source:

2021 21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2021) | 2021

Funding:

National Natural Science Foundation of China;

Keywords:

Image captioning; Attention Network; Graph Convolutional Network; Machine Learning;

DOI:

10.1109/ICDM51629.2021.00201

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Image captioning is an interesting and challenging task. Existing image captioning approaches are based mainly on the encoder-decoder architecture, but they suffer from problems such as inaccurate caption content and generated sentences that are not sufficiently rich. This paper proposes a novel image captioning model based on a self-attention network and a scene graph relationship network. First, an improved self-attention network is added to the visual feature extraction stage to evaluate the effectiveness of global image information for caption generation. Then, we design a visual intensity parameter to coordinate the contributions of the visual features and the language model during word generation. Finally, a graph convolutional network is designed to extract relationships from the scene information, making the generated captions more engaging and improving the accuracy of fine-grained captioning. We evaluate the model on the MS-COCO and Flickr 30K datasets, and the experimental results show that it achieves state-of-the-art performance.

Pages: 1535-1540

Number of pages: 6
