An Image Caption Model Based on the Scene Graph and Semantic Prior Network

Cited by: 0

Authors:

Liu, Weifeng [1]

Zhang, Nan [1]

Wang, Yaning [2]

Di, Wu [3]

Affiliations:

[1] Shaanxi Univ Sci & Technol, Sch Elect & Control Engn, Xian, Peoples R China

[2] Hangzhou Dianzi Univ, Sch Automat, Hangzhou, Peoples R China

[3] Shaanxi Univ Chinese Med, Sch Basic Med Sci, Xianyang, Shaanxi, Peoples R China

Source:

2022 11TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND INFORMATION SCIENCES (ICCAIS) | 2022

Keywords:

Image caption; scene graph; semantic prior; memory network;

DOI:

10.1109/ICCAIS56082.2022.9990458

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract:

In this paper, we propose an image caption model based on scene graphs and semantic priors to address the over-reliance of traditional image caption models on training data. First, the original image features and the scene graph features are fused by embedding the scene image features into the feature space. Second, a memory network is trained on a sentence reconstruction task, using image captions from the existing dataset, so that it retains semantic prior knowledge. The scene graph features are then combined with the semantic prior information to reconstruct new features, which are sent to the decoder to produce an image caption.
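
The data flow described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the fusion rule, the memory read mechanism, or any dimensions, so the additive fusion, the dot-product attention read, the concatenation step, and every name below (`fuse`, `read_memory`, `reconstruct`, `d_model`, `n_slots`) are assumptions chosen only to make the three stages concrete. The random weights stand in for parameters that would be learned in practice.

```python
import math
import random


def linear(x, w):
    """Project vector x (length d_in) with weight matrix w (d_in rows, d_out columns)."""
    return [sum(xi * row[j] for xi, row in zip(x, w)) for j in range(len(w[0]))]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(image_feat, graph_feat, w_img, w_graph):
    """Embed both feature types into one shared space and add them (assumed fusion rule)."""
    img = linear(image_feat, w_img)
    graph = linear(graph_feat, w_graph)
    return [a + b for a, b in zip(img, graph)]


def read_memory(query, memory):
    """Attention read (assumed): weight memory slots by scaled dot-product similarity."""
    scale = math.sqrt(len(query))
    scores = [sum(q * m for q, m in zip(query, slot)) / scale for slot in memory]
    weights = softmax(scores)
    prior = [sum(w * slot[k] for w, slot in zip(weights, memory))
             for k in range(len(query))]
    return prior, weights


def reconstruct(fused, prior):
    """Concatenate fused features with the retrieved prior (assumed combination rule)."""
    return fused + prior


def random_matrix(rows, cols, rng):
    """Small random matrix standing in for learned parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_img, d_graph, d_model, n_slots = 8, 6, 4, 5  # hypothetical sizes

image_feat = [rng.random() for _ in range(d_img)]
graph_feat = [rng.random() for _ in range(d_graph)]
w_img = random_matrix(d_img, d_model, rng)
w_graph = random_matrix(d_graph, d_model, rng)
# Placeholder for a memory that would be trained via sentence reconstruction.
memory = random_matrix(n_slots, d_model, rng)

fused = fuse(image_feat, graph_feat, w_img, w_graph)
prior, weights = read_memory(fused, memory)
decoder_input = reconstruct(fused, prior)  # would be passed to a caption decoder
print(len(decoder_input), round(sum(weights), 6))
```

In this sketch the memory is read-only at caption time, which matches the abstract's description of a memory network trained separately on sentence reconstruction; how the published model actually queries the memory is not stated in this record.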

Pages: 60-66

Page count: 7

50 records in total

[1] Transformer model incorporating local graph semantic attention for image caption
Qian, Kui
Pan, Yuchen
Xu, Hao
Tian, Lei
VISUAL COMPUTER, 2024, 40 (09): 6533-6544
[2] Image Caption with Prior Knowledge Graph and Heterogeneous Attention
Wang, Junjie
Huang, Wenfeng
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT II, 2023, 14255: 344-356
[3] Image Captioning with Scene-graph Based Semantic Concepts
Gao, Lizhao
Wang, Bo
Wang, Wenmin
PROCEEDINGS OF 2018 10TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING (ICMLC 2018), 2018: 225-229
[4] An Approach to Generate a Caption for an Image Collection Using Scene Graph Generation
Phueaksri, Itthisak
Kastner, Marc A.
Kawanishi, Yasutomo
Komamizu, Takahiro
Ide, Ichiro
IEEE ACCESS, 2023, 11: 128245-128260
[5] Image caption model of double LSTM with scene factors
Peng, Yuqing
Liu, Xuan
Wang, Weihua
Zhao, Xiaosong
Wei, Ming
IMAGE AND VISION COMPUTING, 2019, 86: 38-44
[6] Image Quality Caption with Attentive and Recurrent Semantic Attractor Network
Yang, Wen
Wu, Jinjian
Li, Leida
Dong, Weisheng
Shi, Guangming
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021: 4501-4509
[7] Aligned visual semantic scene graph for image captioning
Zhao, Shanshan
Li, Lixiang
Peng, Haipeng
DISPLAYS, 2022, 74
[8] Scene Graph Semantic Inference for Image and Text Matching
Pei, Jiaming
Zhong, Kaiyang
Yu, Zhi
Wang, Lukun
Lakshmanna, Kuruva
ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (05)
[9] Scene Graph based Fusion Network for Image-Text Retrieval
Wang, Guoliang
Shang, Yanlei
Chen, Yong
Zhen, Chaoqi
Cheng, Dequan
2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023: 138-143
[10] VMSG: a video caption network based on multimodal semantic grouping and semantic attention
Yang, Xin
Wang, Xiangchen
Ye, Xiaohui
Li, Tao
MULTIMEDIA SYSTEMS, 2023, 29 (05): 2575-2589