Joint Scence Network and Attention-Guided for Image Captioning

Cited by: 2
Authors
Zhou, Dongming [1]
Yang, Jing [1]
Zhang, Canlong [1]
Tang, Yanping [2]
Affiliations
[1] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541000, Peoples R China
[2] Guilin Univ Elect Technol, Sch Comp Sci & Informat Secur, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image captioning; Attention Network; Graph Convolutional Network; Machine Learning;
DOI
10.1109/ICDM51629.2021.00201
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Image captioning is an interesting and challenging task. Existing image captioning approaches are based mainly on the encoder-decoder architecture, but they suffer from inaccurate caption content, and the generated sentences are not sufficiently rich. This paper proposes a novel image captioning model based on a self-attention network and a scene graph relationship network. First, an improved self-attention network is added to the visual feature extraction stage to evaluate the effectiveness of global image information for caption generation. Then, a visual intensity parameter is designed to coordinate how the visual features and the language model each contribute to word generation. Finally, a graph convolutional network is designed to extract relationships from the scene information, rendering the generated caption more exciting and increasing the accuracy of fine-grained captioning. The model shows satisfactory performance on the MS-COCO and Flickr 30K datasets, and the experimental results demonstrate that it achieves state-of-the-art performance.
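The abstract does not give the form of the visual intensity parameter. As a rough illustration only, the sketch below assumes it acts as a scalar gate in (0, 1) that blends a visual feature vector with a language-model feature vector before a word is predicted. The function name `gated_mix`, the `gate_logit` input, and the convex-combination form are all assumptions made for this example and are not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_mix(visual, language, gate_logit):
    """Blend two equal-length feature vectors with a scalar gate.

    Hypothetical stand-in for a "visual intensity" gate: a gate value
    near 1 favours the visual features, near 0 favours the language
    features. This is an assumed form, not the authors' formulation.
    """
    if len(visual) != len(language):
        raise ValueError("feature vectors must have the same length")
    beta = sigmoid(gate_logit)
    return [beta * v + (1.0 - beta) * l for v, l in zip(visual, language)]


# Toy feature vectors, invented for the example.
visual = [1.0, 0.0, 2.0]
language = [0.0, 1.0, 0.0]

# A gate logit of 0 gives beta = 0.5, so both sources contribute equally.
mixed = gated_mix(visual, language, 0.0)
```

In a trained model the gate logit would presumably be predicted from the decoder state at each time step, so that visually grounded words lean on the image features while function words lean on the language model.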
Pages: 1535-1540
Page count: 6