A novel automatic image caption generation using bidirectional long-short term memory framework

Authors
Zhongfu Ye
Rashid Khan
Nuzhat Naqvi
M. Shujah Islam
Affiliations
[1] University of Science and Technology of China
Source
Multimedia Tools and Applications, 2021, 80 (17)
Keywords
Image captioning; Inception v3; B-LSTM; P-MFO optimization; BLEU score
DOI
Not available
Abstract
Image captioning, the process of generating a textual description of an image, has emerged as a hot research topic due to its practical importance in many domains. It is a challenging task because it draws on both Natural Language Processing and Computer Vision to generate the captions. Although the literature has reported notable image captioning methodologies, they still fall short of a substantial performance level across diverse datasets. This paper proposes an image caption generation mechanism based on an optimized Bidirectional Long Short-Term Memory (B-LSTM) model. We propose a variant of Moth Flame Optimization (MFO), termed here Proposed Moth Flame Optimization (PMFO), which uses a correlation-based logarithmic spiral update. The performance of the proposed model is demonstrated on the benchmark datasets Flickr8k, Flickr30k, VizWiz and COCO using well-known metrics such as CIDEr, BLEU, SPICE and ROUGE. The performance analysis shows that the B-LSTM achieves better caption generation performance than state-of-the-art methods.
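For orientation, the logarithmic spiral step that the abstract refers to can be sketched in its standard Moth Flame Optimization form, where a moth M moves around a flame F as D * e^(b*t) * cos(2*pi*t) + F with D = |F - M|. This is a minimal sketch of the standard update only; the correlation-based modification that defines PMFO is not described in this record and is not reproduced here. The function name `spiral_update`, the default `b = 1.0`, and drawing `t` from [-1, 1] are illustrative assumptions.

```python
import math
import random


def spiral_update(moth, flame, b=1.0, t=None, rng=random):
    """Move one moth around one flame along a logarithmic spiral.

    Standard MFO step (not the paper's correlation-based PMFO variant):
        new = |flame - moth| * exp(b * t) * cos(2 * pi * t) + flame
    `b` sets the spiral shape; `t` controls how close the moth lands to
    the flame. Both defaults are assumptions made for illustration.
    """
    if t is None:
        # Assumed sampling range; many MFO variants shrink it over iterations.
        t = rng.uniform(-1.0, 1.0)
    factor = math.exp(b * t) * math.cos(2.0 * math.pi * t)
    return [abs(f - m) * factor + f for m, f in zip(moth, flame)]


# Example: one step for a 2-dimensional moth with a fixed t.
new_position = spiral_update([0.0, 2.0], [1.0, 1.0], t=0.0)
```

In a full optimizer this step would be applied to every moth in each iteration, with flames taken from the best positions found so far; how PMFO selects or weights them is not stated in the abstract.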
Pages: 25557 - 25582
Page count: 25
Ye, Zhongfu; Khan, Rashid; Naqvi, Nuzhat; Islam, M. Shujah. A novel automatic image caption generation using bidirectional long-short term memory framework [J]. Multimedia Tools and Applications, 2021, 80 (17): 25557 - 25582.