Collaborative Learning Method for Natural Image Captioning

Cited by: 0
Authors
Wang, Rongzhao [1 ]
Liu, Libo [1 ]
Affiliation
[1] Ningxia Univ, Sch Informat Engn, Yinchuan, Peoples R China
Source
Keywords
Image captioning; Pix2pix inverting; Collaborative learning;
DOI
10.1007/978-981-19-5194-7_19
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Subject classification code
0812;
Abstract
We propose a collaborative learning method for natural image captioning. Many existing methods use pretrained image-classification CNNs to obtain feature representations for caption generation, which ignores the gap in image feature representations between different computer vision tasks. To address this problem, our method exploits the similarity between the image captioning and pix-to-pix inverting tasks to narrow the feature representation gap. Specifically, our framework consists of two modules: 1) the pix2pix module (P2PM), which has a shared-learning feature extractor that extracts feature representations and a U-net architecture that encodes the image into a latent code and then decodes it back to the original image; 2) the natural language generation module (NLGM), which generates descriptions from the feature representations extracted by P2PM. Consequently, both the feature representations and the generated image captions improve during the collaborative learning process. Experimental results on the MSCOCO 2017 dataset demonstrate the effectiveness of our approach relative to the compared methods.
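The abstract does not state the training objective. As a minimal sketch only, one common way to train two modules that share a feature extractor is to minimize a weighted sum of the two task losses, so that gradients from both tasks update the shared extractor. The loss choices below (L1 reconstruction, token-level cross-entropy), the weights `w_rec` and `w_cap`, and all function names are hypothetical illustrations, not the authors' formulation.

```python
import math

# Hypothetical sketch: none of these names or loss choices come from the paper.

def l1_reconstruction_loss(original, reconstructed):
    """Mean absolute pixel error between an image and its reconstruction."""
    return sum(abs(o - r) for o, r in zip(original, reconstructed)) / len(original)

def caption_cross_entropy(token_probs, target_ids):
    """Mean negative log-likelihood of the reference caption tokens.

    token_probs[i] is the predicted distribution over the vocabulary at step i.
    """
    return -sum(math.log(p[t]) for p, t in zip(token_probs, target_ids)) / len(target_ids)

def collaborative_loss(original, reconstructed, token_probs, target_ids,
                       w_rec=1.0, w_cap=1.0):
    """Weighted sum of both task losses (the weights are assumed, not reported)."""
    return (w_rec * l1_reconstruction_loss(original, reconstructed)
            + w_cap * caption_cross_entropy(token_probs, target_ids))

# Toy usage: a 2-pixel "image" and a 1-token caption over a 2-word vocabulary.
loss = collaborative_loss([0.0, 1.0], [0.5, 0.5], [[0.5, 0.5]], [0])
```

In a real system the two losses would be computed from the outputs of the U-net decoder and the language decoder respectively; the scalar toy inputs here only show how the terms combine.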
Pages: 249-261
Page count: 13
Related papers
50 in total
  • [1] A Hybridized Deep Learning Method for Bengali Image Captioning
    Humaira, Mayeesha
    Paul, Shimul
    Jim, Md Abidur Rahman Khan
    Ami, Amit Saha
    Shah, Faisal Muhammad
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (02) : 698 - 707
  • [2] Deep Learning for automatically describing images in natural language - Image Captioning
    Hotaran, Anca Mihaela
    Vrejoiu, Mihnea Horia
    ROMANIAN JOURNAL OF INFORMATION TECHNOLOGY AND AUTOMATIC CONTROL-REVISTA ROMANA DE INFORMATICA SI AUTOMATICA, 2020, 30 (01): : 87 - 100
  • [3] Improving Reinforcement Learning Based Image Captioning with Natural Language Prior
    Guo, Tszhang
    Chang, Shiyu
    Yu, Mo
    Bai, Kun
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 751 - 756
  • [4] Contrastive Learning for Image Captioning
    Dai, Bo
    Lin, Dahua
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [5] Learning to Evaluate Image Captioning
    Cui, Yin
    Yang, Guandao
    Veit, Andreas
    Huang, Xun
    Belongie, Serge
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 5804 - 5812
  • [6] Meta Learning for Image Captioning
    Li, Nannan
    Chen, Zhenzhong
    Liu, Shan
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 8626 - 8633
  • [7] Collaborative strategy network for spatial attention image captioning
    Dongming Zhou
    Jing Yang
    Riqiang Bao
    Applied Intelligence, 2022, 52 : 9017 - 9032
  • [8] Dual-Level Collaborative Transformer for Image Captioning
    Luo, Yunpeng
    Ji, Jiayi
    Sun, Xiaoshuai
    Cao, Liujuan
    Wu, Yongjian
    Huang, Feiyue
    Lin, Chia-Wen
    Ji, Rongrong
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 2286 - 2293
  • [9] Collaborative strategy network for spatial attention image captioning
    Zhou, Dongming
    Yang, Jing
    Bao, Riqiang
    APPLIED INTELLIGENCE, 2022, 52 (08) : 9017 - 9032
  • [10] Deep Learning for Military Image Captioning
    Das, Subrata
    Jain, Lalit
    Das, Amp
    2018 21ST INTERNATIONAL CONFERENCE ON INFORMATION FUSION (FUSION), 2018, : 2165 - 2171