Global-local feature attention network with reranking strategy for image caption generation

Cited by: 2
Authors
Wu J. [1]
Xie S.-Y. [1]
Shi X.-B. [1]
Chen Y.-W. [2]
Affiliations
[1] College of Engineering, Shantou University, Shantou
[2] Key Laboratory of Digital Signal and Image Processing of Guangdong, Shantou University, Shantou
DOI
10.1007/s11801-017-7185-4
Abstract
In this paper, a novel framework named global-local feature attention network with reranking strategy (GLAN-RS) is presented for the image captioning task. Rather than adopting only unitary visual information as in classical models, GLAN-RS uses an attention mechanism to capture local convolutional salient image maps. Furthermore, we adopt a reranking strategy to adjust the priority of the candidate captions and select the best one. The proposed model is verified on the Microsoft Common Objects in Context (MSCOCO) benchmark dataset across seven standard evaluation metrics. Experimental results show that GLAN-RS significantly outperforms state-of-the-art approaches such as the multimodal recurrent neural network (MRNN) and Google NIC, achieving an improvement of 20% in BLEU4 score and 13 points in CIDEr score. © 2017, Tianjin University of Technology and Springer-Verlag GmbH Germany, part of Springer Nature.
Pages: 448 - 451
Number of pages: 3
Related papers
50 in total
  • [1] Global-Local Feature Attention Network with Reranking Strategy for Image Caption Generation
    Wu, Jie
    Xie, Siya
    Shi, Xinbao
    Chen, Yaowen
    [J]. COMPUTER VISION, PT I, 2017, 771 : 157 - 167
  • [2] Global-local feature attention network with reranking strategy for image caption generation
    Wu, Jie
    Xie, Siya
    Shi, Xinbao
    Chen, Yaowen
    [J]. Optoelectronics Letters, 2017, 13 (06) : 448 - 451
  • [3] Image Caption with Global-Local Attention
    Li, Linghui
    Tang, Sheng
    Deng, Lixi
    Zhang, Yongdong
    Tian, Qi
    [J]. THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 4133 - 4139
  • [4] Neural Image Caption Generation with Global Feature Based Attention Scheme
    Wang, Yongzhuang
    Xiong, Hongkai
    [J]. IMAGE AND GRAPHICS (ICIG 2017), PT II, 2017, 10667 : 51 - 61
  • [5] Image captioning based on global-local feature and adaptive-attention
    Zhao, Xiao-Hu
    Yin, Liang-Fei
    Zhao, Cheng-Long
    [J]. Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2020, 54 (01) : 126 - 134
  • [6] A global-local feature adaptive fusion network for image scene classification
    Lv, Guangrui
    Dong, Lili
    Zhang, Wenwen
    Xu, Wenhai
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (03) : 6521 - 6554
  • [7] A global-local feature adaptive fusion network for image scene classification
    Guangrui Lv
    Lili Dong
    Wenwen Zhang
    Wenhai Xu
    [J]. Multimedia Tools and Applications, 2024, 83 : 6521 - 6554
  • [8] GLA: Global-Local Attention for Image Description
    Li, Linghui
    Tang, Sheng
    Zhang, Yongdong
    Deng, Lixi
    Tian, Qi
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2018, 20 (03) : 726 - 737
  • [9] Image Caption Generation with Local Semantic and Global Information
    Liu, Xing
    Liu, Weibin
    Xing, Weiwei
    [J]. 2019 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI 2019), 2019, : 680 - 685
  • [10] Global-Local Channel Attention for Hyperspectral Image Classification
    Yan, Peilin
    Qin, Haolin
    Wang, Jihui
    Xu, Tingfa
    Song, Liqiang
    Li, Hui
    Li, Jianan
    [J]. INTERNATIONAL CONFERENCE ON ELECTRICAL, COMPUTER AND ENERGY TECHNOLOGIES (ICECET 2021), 2021, : 1633 - 1638