An Image Caption Model Based on the Scene Graph and Semantic Prior Network

Authors
Liu, Weifeng [1]
Zhang, Nan [1]
Wang, Yaning [2]
Di, Wu [3]
Affiliations
[1] Shaanxi Univ Sci & Technol, Sch Elect & Control Engn, Xian, Peoples R China
[2] Hangzhou Dianzi Univ, Sch Automat, Hangzhou, Peoples R China
[3] Shaanxi Univ Chinese Med, Sch Basic Med Sci, Xianyang, Shaanxi, Peoples R China
Keywords
Image caption; scene graph; semantic prior; memory network
DOI
10.1109/ICCAIS56082.2022.9990458
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we propose an image caption model based on scene graphs and semantic priors to address the excessive dependence of traditional image caption models on training data. First, the original image features and the scene graph features are fused by embedding the scene image features into the feature space. Second, using image captions from the existing dataset, a sentence reconstruction task trains the memory network to retain semantic prior knowledge. The scene graph features are then combined with the semantic prior information to reconstruct new features, which are sent to the decoder to produce an image caption.
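The abstract gives no equations, so the following is only a minimal sketch of the data flow it describes, not the authors' implementation. It assumes additive fusion after a linear embedding, a dot-product attention read over memory slots, and concatenation to form the new features. The function names (`fuse`, `read_memory`, `reconstruct`), the embedding matrix `W_embed`, and all dimensions are hypothetical. Training the memory through sentence reconstruction and the decoder itself are omitted.

```python
import math


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(image_feat, graph_feat, W_embed):
    """Embed the scene graph feature into the image feature space and add.

    Assumption: additive fusion after a linear map; the abstract does not
    specify the fusion operator.
    """
    embedded = matvec(W_embed, graph_feat)
    return [a + b for a, b in zip(image_feat, embedded)]


def read_memory(query, memory):
    """Read a semantic prior from memory slots via dot-product attention.

    Assumption: in the paper the memory is trained with a sentence
    reconstruction task; here it is just a fixed list of slot vectors.
    """
    scores = [sum(q * m for q, m in zip(query, slot)) for slot in memory]
    weights = softmax(scores)
    dim = len(memory[0])
    prior = [sum(w * slot[d] for w, slot in zip(weights, memory))
             for d in range(dim)]
    return prior, weights


def reconstruct(fused, memory):
    """Combine fused features with the retrieved prior (by concatenation)."""
    prior, weights = read_memory(fused, memory)
    return fused + prior, weights


# Toy example: 2-d image feature, 3-d scene graph feature, 2 memory slots.
image_feat = [1.0, 2.0]
graph_feat = [1.0, 0.0, 1.0]
W_embed = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 1.0]]
memory = [[1.0, 0.0],
          [0.0, 1.0]]

fused = fuse(image_feat, graph_feat, W_embed)
new_feat, weights = reconstruct(fused, memory)
print(len(new_feat), weights)
```

In this sketch `new_feat` stands in for the reconstructed features that the abstract says are passed to the decoder; a real model would learn `W_embed` and the memory contents rather than fix them by hand.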
Pages: 60-66
Number of pages: 7