Joint Scence Network and Attention-Guided for Image Captioning

Cited by: 2
Authors
Zhou, Dongming [1]
Yang, Jing [1]
Zhang, Canlong [1]
Tang, Yanping [2]
Affiliations
[1] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541000, Peoples R China
[2] Guilin Univ Elect Technol, Sch Comp Sci & Informat Secur, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image captioning; Attention Network; Graph Convolutional Network; Machine Learning;
DOI
10.1109/ICDM51629.2021.00201
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Image captioning is an interesting and challenging task. The previously established image captioning approach is based mainly on the encoder-decoder architecture, but it suffers from problems such as inaccurate captioning information, and the generated captioning sentences are not sufficiently rich. This paper proposes a novel image captioning model that is based on a self-attention network and a scene graph relationship network. First, an improved self-attention network is added to the extraction of visual features to evaluate the effectiveness of image global information for image generation. Then, we design a visual intensity parameter to coordinate the strategies of visual features and language model for word generation. Finally, a graph convolutional network is designed to extract the relationships from the scene information to render the generated caption more exciting and to increase the accuracy of the fine-grained captioning. We demonstrated the satisfactory performance of the model on the MS-COCO and Flickr 30K datasets. The experimental results demonstrate that the proposed model realizes state-of-the-art performance.
Pages: 1535 - 1540
Number of pages: 6
Related papers
50 in total
  • [21] ATTENTION-GUIDED COST VOLUME REFINEMENT NETWORK FOR SATELLITE STEREO IMAGE MATCHING
    Jeong, W. J.
    Park, S. Y.
    GEOSPATIAL WEEK 2023, VOL. 48-1, 2023, : 1045 - 1050
  • [22] Dual Attention-Guided Detail and Structure Information Fusion Network for Image Dehazing
    Gao J.-R.
    Li H.-F.
    Zhang Y.-F.
    Xie M.-H.
    Li F.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2023, 51 (01): : 160 - 171
  • [23] Deep Attention-Guided Spatial-Spectral Network for Hyperspectral Image Unmixing
    Qi, Lin
    Yue, Mengyi
    Gao, Feng
    Cao, Bing
    Dong, Junyu
    Gao, Xinbo
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2024, 21 : 1 - 5
  • [24] Attention-guided Unified Network for Panoptic Segmentation
    Li, Yanwei
    Chen, Xinze
    Zhu, Zheng
    Xie, Lingxi
    Huang, Guan
    Du, Dalong
    Wang, Xingang
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 7019 - 7028
  • [25] Multiscale Attention-Guided Panoptic Segmentation Network
    Fu, Du
    Qu, Shaojun
    Fu, Ya
    Computer Engineering and Applications, 2023, 59 (22) : 223 - 232
  • [26] Attention-guided feature fusion and joint learning for remote sensing image scene classification
    Yu D.
    Xu Q.
    Zhao C.
    Guo H.
    Lu J.
    Lin Y.
    Liu X.
    Cehui Xuebao/Acta Geodaetica et Cartographica Sinica, 2023, 52 (04): : 624 - 637
  • [27] Attention-guided aggregation stereo matching network
    Zhang, Yaru
    Li, Yaqian
    Wu, Chao
    Liu, Bin
    IMAGE AND VISION COMPUTING, 2021, 106
  • [28] Attention-Guided Network for Semantic Video Segmentation
    Li, Jiangyun
    Zhao, Yikai
    Fu, Jun
    Wu, Jiajia
    Liu, Jing
    IEEE ACCESS, 2019, 7 : 140680 - 140689
  • [29] Text-Guided Attention Model for Image Captioning
    Mun, Jonghwan
    Cho, Minsu
    Han, Bohyung
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 4233 - 4239
  • [30] AMFNet: An attention-guided generative adversarial network for multi-model image fusion
    Wang, Jing
    Yu, Long
    Tian, Shengwei
    Wu, Weidong
    Zhang, Dezhi
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2022, 78