Towards Few-shot Image Captioning with Cycle-based Compositional Semantic Enhancement Framework

Cited by: 0
Authors
Zhang, Peng [1 ]
Bai, Yang [1 ]
Su, Jie [2 ]
Huang, Yan [3 ]
Long, Yang [1 ]
Affiliations
[1] Univ Durham, Dept Comp Sci, Durham, England
[2] Newcastle Univ, Dept Comp Sci, Newcastle Upon Tyne, Tyne & Wear, England
[3] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
Keywords
Image captioning; cycle-based; switcher module; LANGUAGE;
DOI
10.1109/IJCNN54540.2023.10191558
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Much effort has been devoted to multi-modal tasks, of which image captioning is a classic example. The CLIP model in particular has improved image captioning performance; meanwhile, few-shot and zero-shot captioning have become significant research problems. In this work, we design new few-shot and zero-shot settings for image captioning that differ from the popular directions. Our direction focuses on how the existing dataset affects the ability of the captioning model. Through analysis, we find that the frequency of word combinations directly influences the performance of the captioning model, and we define the new few-shot and zero-shot settings on this basis. To address the problem, we propose a cycle-based captioning framework built on data augmentation, whose critical component is a novel switcher module. Finally, experiments demonstrate that our framework achieves state-of-the-art performance in the traditional, few-shot, and zero-shot settings.
Pages: 8
Related papers
  • [1] FRIC: a framework for few-shot remote sensing image captioning
    Zhou, Haonan
    Xia, Lurui
    Du, Xiaoping
    Li, Sen
    INTERNATIONAL JOURNAL OF DIGITAL EARTH, 2024, 17 (01)
  • [2] Self-Distillation for Few-Shot Image Captioning
    Chen, Xianyu
    Jiang, Ming
    Zhao, Qi
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021, : 545 - 555
  • [3] Unsupervised Semantic Segmentation with Feature Enhancement for Few-shot Image Classification
    Li, Xiang
    Xu, Zhuoming
    Xu, Qi
    Tang, Yan
    2022 TENTH INTERNATIONAL CONFERENCE ON ADVANCED CLOUD AND BIG DATA, CBD, 2022, : 104 - 109
  • [4] Semantic Prompt for Few-Shot Image Recognition
    Chen, Wentao
    Si, Chenyang
    Zhang, Zhang
    Wang, Liang
    Wang, Zilei
    Tan, Tieniu
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 23581 - 23591
  • [5] An Image Enhancement Method for Few-shot Classification
    Wu, Benze
    Wu, Yirui
    Wan, Shaohua
    2021 IEEE 19TH INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (EUC 2021), 2021, : 159 - 165
  • [6] A Image Enhancement Method for Few-shot Classification
    Wu, Benze
    Wu, Yirui
    Wan, Shaohua
    2021 IEEE 19TH INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (EUC 2021), 2021, : 201 - 207
  • [7] Self-Learning for Few-Shot Remote Sensing Image Captioning
    Zhou, Haonan
    Du, Xiaoping
    Xia, Lurui
    Li, Sen
    REMOTE SENSING, 2022, 14 (18)
  • [8] Survey on Image Semantic Segmentation in Dilemma of Few-Shot
    Wei, Ting
    Li, Xinlei
    Liu, Hui
    Computer Engineering and Applications, 2024, 59 (02) : 1 - 11
  • [9] Few-Shot Knowledge Graph Completion Based on Subgraph Structure Semantic Enhancement
    Yang, Rongtai
    Shao, Yubin
    Du, Qingzhi
    Long, Hua
    Ma, Dinan
    Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications, 2024, 47 (04) : 71 - 76
  • [10] Enhancement of Few-shot Image Classification Using Eigenimages
    Jonghyun Ko
    Wonzoo Chung
    International Journal of Control, Automation and Systems, 2023, 21 : 4088 - 4097