Text Augmentation for Compressed Image Captioning Models

Cited by: 0
Authors
Atliha, Viktar [1]
Sesok, Dmitrij [1]
Affiliations
[1] Vilnius Gediminas Tech Univ, Dept Informat Technol, Vilnius, Lithuania
Keywords
image captioning; model compression; text augmentation
DOI
10.1109/ESTREAM56157.2022.9781641
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The field of compressing image captioning models for real-time use on mobile devices remains under-explored despite its high practical value. Recent research has made substantial progress on this topic by compressing classical state-of-the-art models with basic model size reduction methods. However, more sophisticated approaches that have yielded strong quality improvements for ordinary image captioning models are often left behind. One such technique is caption augmentation. It has been shown to help large uncompressed models, but its influence on the metrics of smaller models was not clear. In this paper we show that, combined with other compression techniques for image captioning models, text augmentation can help improve quality while keeping model size small enough to fit mobile devices.
Pages: 4
Related papers
  • [1] Text Augmentation Using BERT for Image Captioning
    Atliha, Viktar
    Sesok, Dmitrij
    [J]. APPLIED SCIENCES-BASEL, 2020, 10 (17)
  • [2] Multimodal Data Augmentation for Image Captioning using Diffusion Models
    Xiao, Changrong
    Xu, Sean Xin
    Zhang, Kunpeng
    [J]. PROCEEDINGS OF THE 1ST WORKSHOP ON LARGE GENERATIVE MODELS MEET MULTIMODAL APPLICATIONS, LGM3A 2023, 2023: 23-33
  • [3] Image Captioning using Deep Learning: Text Augmentation by Paraphrasing via Backtranslation
    Turkerud, Ingrid Ravn
    Mengshoel, Ole Jakob
    [J]. 2021 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2021), 2021
  • [4] Text to Image Synthesis for Improved Image Captioning
    Hossain, Md. Zakir
    Sohel, Ferdous
    Shiratuddin, Mohd Fairuz
    Laga, Hamid
    Bennamoun, Mohammed
    [J]. IEEE ACCESS, 2021, 9: 64918-64928
  • [5] Visual to Text: Survey of Image and Video Captioning
    Li, Sheng
    Tao, Zhiqiang
    Li, Kang
    Fu, Yun
    [J]. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2019, 3 (04): 297-312
  • [6] Image Captioning Generator Text-to-Speech
    Sharma, Tripti
    Anand, Neetu
    Gaurav, Kumar
    Kapur, Rohit
    [J]. INTERNATIONAL JOURNAL OF NEXT-GENERATION COMPUTING, 2022, 13 (03): 448-457
  • [7] SMALLCAP: Lightweight Image Captioning Prompted with Retrieval Augmentation
    Ramos, Rita
    Martins, Bruno
    Elliott, Desmond
    Kementchedjhieva, Yova
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023: 2840-2849
  • [8] Switching Text-Based Image Encoders for Captioning Images With Text
    Ueda, Arisa
    Yang, Wei
    Sugiura, Komei
    [J]. IEEE ACCESS, 2023, 11: 55706-55715
  • [9] Image captioning by diffusion models: A survey
    Daneshfar, Fatemeh
    Bartani, Ako
    Lotfi, Pardis
    [J]. Engineering Applications of Artificial Intelligence, 2024, 138
  • [10] Visuals to Text: A Comprehensive Review on Automatic Image Captioning
    Yue Ming
    Nannan Hu
    Chunxiao Fan
    Fan Feng
    Jiangwan Zhou
    Hui Yu
    [J]. IEEE/CAA Journal of Automatica Sinica, 2022, 9 (08): 1339-1365