Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network

Cited by: 0
Authors
Ji, Jiayi [1]
Luo, Yunpeng [1]
Sun, Xiaoshuai [1, 2]
Chen, Fuhai [1]
Luo, Gen [1]
Wu, Yongjian [3]
Gao, Yue [4]
Ji, Rongrong [1, 2]
Affiliations
[1] Xiamen Univ, Sch Informat, Dept Artificial Intelligence, Media Analyt & Comp Lab, Xiamen, Peoples R China
[2] Xiamen Univ, Inst Artificial Intelligence, Xiamen, Peoples R China
[3] Tencent Youtu Lab, Xiamen, Peoples R China
[4] Tsinghua Univ, Beijing, Peoples R China
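Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract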
Transformer-based architectures have shown great success in image captioning, where object regions are encoded and then attended to, producing vector representations that guide caption decoding. However, such representations contain only region-level information and ignore the global information that reflects the entire image, which limits the capability for complex multi-modal reasoning in image captioning. In this paper, we introduce a Global Enhanced Transformer (termed GET) that extracts a more comprehensive global representation and then adaptively guides the decoder to generate high-quality captions. In GET, a Global Enhanced Encoder is designed to embed the global feature, and a Global Adaptive Decoder is designed to guide caption generation. The former models intra- and inter-layer global representation by means of the proposed Global Enhanced Attention and a layer-wise fusion module. The latter contains a Global Adaptive Controller that adaptively fuses the global information into the decoder to guide caption generation. Extensive experiments on the MS COCO dataset demonstrate the superiority of GET over many state-of-the-art methods.
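The abstract does not say how the Global Adaptive Controller fuses global information into the decoder. The sketch below only illustrates one common way to do adaptive fusion, a sigmoid gate that blends a decoder state with a global vector. It is not the authors' implementation: mean pooling stands in for the learned global representation, and the names `mean_pool`, `gated_fuse`, `w_h`, and `w_g` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real score to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def mean_pool(regions):
    """Average region vectors into one global vector.

    Assumption: a plain mean stands in for a learned global feature.
    """
    n = len(regions)
    dim = len(regions[0])
    return [sum(r[i] for r in regions) / n for i in range(dim)]


def gated_fuse(hidden, global_vec, w_h, w_g, bias=0.0):
    """Blend a decoder state with a global vector through a scalar gate.

    The gate is computed from both inputs, so the amount of global
    information mixed in depends on the current decoder state.
    """
    score = (
        sum(w * h for w, h in zip(w_h, hidden))
        + sum(w * g for w, g in zip(w_g, global_vec))
        + bias
    )
    gate = sigmoid(score)
    fused = [gate * g + (1.0 - gate) * h for h, g in zip(hidden, global_vec)]
    return fused, gate


# Toy data: three 2-d region features and one 2-d decoder state.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
global_vec = mean_pool(regions)
hidden = [0.5, -0.5]

# Zero weights give a gate of exactly 0.5, i.e. an even blend.
fused, gate = gated_fuse(hidden, global_vec, w_h=[0.0, 0.0], w_g=[0.0, 0.0])
```

In a trained model the gate weights would be learned, so the decoder could rely more on the global vector at some decoding steps and more on its own state at others.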
Pages: 1655-1663
Number of pages: 9