Learning visual relationship and context-aware attention for image captioning

Cited by: 107

Authors:

Wang, Junbo [1,3]

Wang, Wei [1,3]

Wang, Liang [1,2,3]

Wang, Zhiyong [4]

Feng, David Dagan [4]

Tan, Tieniu [1,2,3]

Affiliations:

[1] Chinese Acad Sci CASIA, Inst Automat, Natl Lab Pattern Recognit NLPR, CRIPAC, Beijing, Peoples R China

[2] CASIA, CEBSIT, Beijing, Peoples R China

[3] UCAS, Beijing, Peoples R China

[4] Univ Sydney, Sch Informat Technol, Sydney, NSW, Australia

Source:

PATTERN RECOGNITION | 2020 / Vol. 98

Funding:

National Natural Science Foundation of China; Australian Research Council;

Keywords:

Image captioning; Relational reasoning; Context-aware attention; RECOGNITION;

DOI:

10.1016/j.patcog.2019.107075

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Image captioning, which automatically generates natural language descriptions for images, has attracted considerable research attention, and attention-based captioning methods have brought substantial progress. However, most attention-based image captioning methods focus on extracting visual information from regions of interest for sentence generation and usually ignore relational reasoning among those regions. Moreover, these methods do not take into account previously attended regions, which can be used to guide subsequent attention selection. In this paper, we propose a novel method that implicitly models the relationships among regions of interest in an image with a graph neural network, together with a novel context-aware attention mechanism that guides attention selection by fully memorizing previously attended visual content. Compared with existing attention-based image captioning methods, ours not only learns relation-aware visual representations for image captioning but also considers historical context information about previous attention. We perform extensive experiments on two public benchmark datasets, MS COCO and Flickr30K, and the experimental results indicate that our proposed method outperforms various state-of-the-art methods on the widely used evaluation metrics. (C) 2019 Elsevier Ltd. All rights reserved.

Pages: 11

50 items in total

[1] Context-aware transformer for image captioning
Yang, Xin
Wang, Ying
Chen, Haishun
Li, Jie
Huang, Tingting
NEUROCOMPUTING, 2023, 549
[2] Image Captioning with Context-Aware Auxiliary Guidance
Song, Zeliang
Zhou, Xiaofei
Mao, Zhendong
Tan, Jianlong
THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 2584 - 2592
[3] Context-aware and co-attention network based image captioning model
Sharma, Himanshu
Srivastava, Swati
IMAGING SCIENCE JOURNAL, 2023, 71 (03) : 244 - 256
[4] Context-Aware Visual Policy Network for Sequence-Level Image Captioning
Liu, Daqing
Zha, Zheng-Jun
Zhang, Hanwang
Zhang, Yongdong
Wu, Feng
PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18), 2018 : 1416 - 1424
[5] Context-Aware Visual Policy Network for Fine-Grained Image Captioning
Zha, Zheng-Jun
Liu, Daqing
Zhang, Hanwang
Zhang, Yongdong
Wu, Feng
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (02) : 710 - 722
[6] Visual Relationship Attention for Image Captioning
Zhang, Zongjian
Wu, Qiang
Wang, Yang
Chen, Fang
2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019
[7] Meshed Context-Aware Beam Search for Image Captioning
Zhao, Fengzhi
Yu, Zhezhou
Wang, Tao
Zhao, He
ENTROPY, 2024, 26 (10)
[8] Stacked Multimodal Attention Network for Context-Aware Video Captioning
Zheng, Yi
Zhang, Yuejie
Feng, Rui
Zhang, Tao
Fan, Weiguo
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (01) : 31 - 42
[9] Context-aware attention network for image recognition
Jiaxu Leng
Ying Liu
Shang Chen
Neural Computing and Applications, 2019, 31 : 9295 - 9305
[10] Context-aware attention network for image recognition
Leng, Jiaxu
Liu, Ying
Chen, Shang
NEURAL COMPUTING & APPLICATIONS, 2019, 31 (12) : 9295 - 9305
