MIXED KNOWLEDGE RELATION TRANSFORMER FOR IMAGE CAPTIONING

Cited by: 0
Authors
Chen, Tianyu [1]
Li, Zhixin [1]
Wei, Jiahui [1]
Xian, Tiantao [1]
Affiliations
[1] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
image captioning; external knowledge; object relation; LANGUAGE
DOI
10.1109/ICASSP43922.2022.9747541
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The internal relationships among image objects have contributed significantly to the development of image captioning, especially when combined with the Transformer architecture. Most such methods compute only the relationships between entities and ignore the information between entities and the background. In addition, the ways of exploring relational information inside the image can be extended. In this paper, we further explore the relationships between objects from both internal and external perspectives, and embed the vital global image information into the internal relationship module. To validate the effectiveness of our model, we conduct extensive experiments on the widely used MSCOCO dataset and achieve state-of-the-art performance on both the online and offline test sets.
Pages: 4403-4407
Page count: 5
Related papers
50 in total
  • [1] Direction Relation Transformer for Image Captioning
    Song, Zeliang
    Zhou, Xiaofei
    Dong, Linhua
    Tan, Jianlong
    Guo, Li
    [J]. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 5056 - 5064
  • [2] Image captioning with transformer and knowledge graph
    Zhang, Yu
    Shi, Xinyu
    Mi, Siya
    Yang, Xu
    [J]. PATTERN RECOGNITION LETTERS, 2021, 143 (143) : 43 - 49
  • [3] External knowledge-assisted Transformer for image captioning
    Li, Zhixin
    Su, Qiang
    Chen, Tianyu
    [J]. IMAGE AND VISION COMPUTING, 2023, 140
  • [4] Distance Transformer for Image Captioning
    Wang, Jiarong
    Lu, Tongwei
    Liu, Xuanxuan
    Yang, Qi
    [J]. 2021 4TH INTERNATIONAL CONFERENCE ON ROBOTICS, CONTROL AND AUTOMATION ENGINEERING (RCAE 2021), 2021, : 73 - 76
  • [5] Rotary Transformer for Image Captioning
    Qiu, Yile
    Zhu, Li
    [J]. SECOND INTERNATIONAL CONFERENCE ON OPTICS AND IMAGE PROCESSING (ICOIP 2022), 2022, 12328
  • [6] Entangled Transformer for Image Captioning
    Li, Guang
    Zhu, Linchao
    Liu, Ping
    Yang, Yi
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 8927 - 8936
  • [7] ACORT: A compact object relation transformer for parameter efficient image captioning
    Tan, Jia Huei
    Tan, Ying Hua
    Chan, Chee Seng
    Chuah, Joon Huang
    [J]. NEUROCOMPUTING, 2022, 482 : 60 - 72
  • [8] Boosted Transformer for Image Captioning
    Li, Jiangyun
    Yao, Peng
    Guo, Longteng
    Zhang, Weicun
    [J]. APPLIED SCIENCES-BASEL, 2019, 9 (16):
  • [9] Prior Knowledge-Guided Transformer for Remote Sensing Image Captioning
    Meng, Lingwu
    Wang, Jing
    Yang, Yang
    Xiao, Liang
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61 : 1 - 13
  • [10] Complementary Shifted Transformer for Image Captioning
    Liu, Yanbo
    Yang, You
    Xiang, Ruoyu
    Ma, Jixin
    [J]. NEURAL PROCESSING LETTERS, 2023, 55 (06) : 8339 - 8363