Text Augmentation for Compressed Image Captioning Models

Cited by: 0

Authors:

Atliha, Viktar [1]

Sesok, Dmitrij [1]

Affiliations:

[1] Vilnius Gediminas Tech Univ, Dept Informat Technol, Vilnius, Lithuania

Source:

2022 IEEE OPEN CONFERENCE OF ELECTRICAL, ELECTRONIC AND INFORMATION SCIENCES (ESTREAM) | 2022

Keywords:

image captioning; model compression; text augmentation

DOI:

10.1109/ESTREAM56157.2022.9781641

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Subject classification code:

0812

Abstract:

Compressing image captioning models so that they are suitable for real-time use on mobile devices remains under-explored despite its high practical value. Recent research has made considerable progress on this topic by compressing classical state-of-the-art models with basic model size reduction methods. However, more sophisticated approaches that have yielded strong quality improvements for ordinary image captioning models are often left behind. One such technique is caption augmentation. It has been shown to help large uncompressed models, but its influence on the metrics of smaller models was not clear. In this paper we show that, combined with other image captioning model compression techniques, text augmentation can help improve quality while keeping the model small enough to fit on mobile devices.

Pages: 4

