Joint Scence Network and Attention-Guided for Image Captioning

Cited by: 2
Authors
Zhou, Dongming [1]
Yang, Jing [1]
Zhang, Canlong [1]
Tang, Yanping [2]
Affiliations
[1] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541000, Peoples R China
[2] Guilin Univ Elect Technol, Sch Comp Sci & Informat Secur, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image captioning; Attention Network; Graph Convolutional Network; Machine Learning;
DOI
10.1109/ICDM51629.2021.00201
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Image captioning is an interesting and challenging task. Existing image captioning approaches are based mainly on the encoder-decoder architecture, but they suffer from problems such as inaccurate caption content and generated sentences that are not sufficiently rich. This paper proposes a novel image captioning model based on a self-attention network and a scene graph relationship network. First, an improved self-attention network is added to the visual feature extraction stage to evaluate the effectiveness of global image information for caption generation. Then, a visual intensity parameter is designed to coordinate the contributions of the visual features and the language model during word generation. Finally, a graph convolutional network is designed to extract relationships from the scene information, making the generated captions more exciting and increasing the accuracy of fine-grained captioning. We evaluate the model on the MS-COCO and Flickr 30K datasets, and the experimental results show that it achieves state-of-the-art performance.
Pages: 1535-1540
Page count: 6