Image caption generation using transformer learning methods: a case study on instagram image

Cited by: 0
Authors
Kwankamon Dittakan
Kamontorn Prompitak
Phutphisit Thungklang
Chatchawan Wongwattanakit
Affiliations
[1] Prince of Songkla University, College of Computing and Faculty of Hospitality and Tourism
[2] Phuket Campus
Keywords
Image Captioning; Transformer Learning Model; Self-Attention Mechanism; Encoder-Decoder; Image Feature Extraction; Instagram Image
DOI
Not available
Abstract
Images are increasingly used for communication. A single image can convey different stories depending on the perspective and thoughts of each viewer. Including image captions aids comprehension, especially for individuals with visual impairments who read Braille or rely on audio descriptions. The purpose of this research is to create an automatic captioning system whose captions are easy to understand and quick to generate, and which can be applied in other related systems. In this research, the transformer learning process is applied to image captioning in place of the combined convolutional neural network (CNN) and recurrent neural network (RNN) approach, which has limitations in processing long-sequence data and managing data complexity. The transformer learning process handles these limitations more efficiently. The image captioning system was trained on a dataset of 5,000 Instagram images tagged with the hashtag "Phuket" (#Phuket). The researchers wrote captions themselves to serve as a dataset for testing the system. The experiments showed that the transformer learning process can generate natural captions that are close to human language. The generated captions are also evaluated using the Bilingual Evaluation Understudy (BLEU) score and the Metric for Evaluation of Translation with Explicit Ordering (METEOR) score, metrics that measure the similarity between machine-translated text and human-written text. This allows the resemblance between the researcher-written captions and the transformer-generated captions to be compared.
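As an illustration of the BLEU evaluation mentioned in the abstract, the following is a minimal sketch of sentence-level BLEU (clipped n-gram precision combined with a brevity penalty). It is not the authors' evaluation code. The function names, the whitespace tokenization, the absence of smoothing, and the example captions are all assumptions made for illustration. METEOR is not covered here because it additionally relies on stemming and synonym matching.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it appears in any single reference."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Reference length closest to the candidate length (ties pick the shorter).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


# Hypothetical captions, invented for this example (not from the paper's data).
human_caption = "a long tail boat on the beach in phuket".split()
generated_caption = "a boat on the beach in phuket".split()
score = bleu(generated_caption, [human_caption])
print(f"BLEU = {score:.3f}")
```

A score of 1.0 means the generated caption matches a reference exactly, while lower values reflect fewer shared n-grams or a caption shorter than the reference. In practice, an established implementation with smoothing would usually be preferred over this sketch.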
Pages: 46397-46417
Page count: 20