An Integrated Hybrid CNN-RNN Model for Visual Description and Generation of Captions

Cited by: 18
Authors
Khamparia, Aditya [1]
Pandey, Babita [2]
Tiwari, Shrasti [3]
Gupta, Deepak [4]
Khanna, Ashish [4]
Rodrigues, Joel J. P. C. [5, 6]
Affiliations
[1] Lovely Profess Univ, Sch Comp Sci & Engn, Phagwara, Punjab, India
[2] Babasaheb Bhimrao Ambedkar Univ, Dept Comp Sci & IT, Satellite Campus, Amethi, UP, India
[3] Lovely Profess Univ, Div Examinat, Phagwara, Punjab, India
[4] Maharaja Agrasen Inst Technol, Delhi, India
[5] Fed Univ Piaui UFPI, Teresina, PI, Brazil
[6] Inst Telecomunicacoes, Lisbon, Portugal
Keywords
Captions; Long short-term memory; Convolutional neural network; Recurrent neural network; Feature vectors; Extraction;
DOI
10.1007/s00034-019-01306-8
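Chinese Library Classification (CLC) codes
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809;
Abstract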
Video captioning is currently considered one of the simplest ways to index and search data efficiently. Suitable captioning of video images can now be facilitated with deep learning architectures. Past research has focused on providing image captions; however, the generation of high-quality captions with suitable semantics for different scenes has not yet been achieved. This work therefore aims to generate well-defined and meaningful captions for images and videos by combining convolutional neural networks (CNN) and recurrent neural networks (RNN). Beginning with the available dataset, features of images and videos were extracted using a CNN. The extracted feature vectors were then used to build a language model, with long short-term memory (LSTM) handling individual word grams. The caption model was trained using a softmax function, and its performance was computed using predefined evaluation metrics. The experimental results demonstrate that the proposed model outperforms existing benchmark models.
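The pipeline outlined in the abstract (image features feeding an LSTM language model whose outputs pass through a softmax) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The vocabulary, dimensions, random untrained weights, the stand-in feature vector, and the choice to seed the LSTM hidden state with the image feature are all assumptions made for illustration, and helper names such as `greedy_caption` are hypothetical.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def lstm_step(x, h, c, params):
    """One LSTM step; params maps a gate name to its weights over [x; h]."""
    z = x + h  # concatenate the input and the previous hidden state
    i = [sigmoid(v) for v in matvec(params["i"], z)]    # input gate
    f = [sigmoid(v) for v in matvec(params["f"], z)]    # forget gate
    o = [sigmoid(v) for v in matvec(params["o"], z)]    # output gate
    g = [math.tanh(v) for v in matvec(params["g"], z)]  # candidate cell
    c_new = [fi * ci + ii * gi for fi, ci, ii, gi in zip(f, c, i, g)]
    h_new = [oi * math.tanh(ci) for oi, ci in zip(o, c_new)]
    return h_new, c_new

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def greedy_caption(feature, vocab, params, embed, W_out, max_len=5):
    """Greedy decoding; the image feature seeds the hidden state (assumption)."""
    h, c = list(feature), [0.0] * len(feature)
    word = vocab.index("<start>")
    caption = []
    for _ in range(max_len):
        h, c = lstm_step(embed[word], h, c, params)
        probs = softmax(matvec(W_out, h))  # distribution over the vocabulary
        word = max(range(len(vocab)), key=probs.__getitem__)
        if vocab[word] == "<end>":
            break
        caption.append(vocab[word])
    return caption

rng = random.Random(0)
vocab = ["<start>", "<end>", "a", "dog", "runs"]  # toy vocabulary (assumption)
hidden = 4
feature = [0.1, -0.2, 0.3, 0.05]  # stand-in for a CNN feature vector
embed = random_matrix(len(vocab), hidden, rng)
params = {k: random_matrix(hidden, 2 * hidden, rng) for k in "ifog"}
W_out = random_matrix(len(vocab), hidden, rng)
caption = greedy_caption(feature, vocab, params, embed, W_out)
print(caption)
```

Because the weights are random rather than trained, the printed words carry no meaning; the sketch only shows how a feature vector, an LSTM step, and a softmax fit together at decoding time.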
Pages: 776-788
Page count: 13
Related papers
50 in total
  • [1] An Integrated Hybrid CNN–RNN Model for Visual Description and Generation of Captions
    Aditya Khamparia
    Babita Pandey
    Shrasti Tiwari
    Deepak Gupta
    Ashish Khanna
    Joel J. P. C. Rodrigues
    [J]. Circuits, Systems, and Signal Processing, 2020, 39 : 776 - 788
  • [2] A Hierarchical CNN-RNN Approach for Visual Emotion Classification
    Li, Liang
    Zhu, Xinge
    Hao, Yiming
    Wang, Shuhui
    Gao, Xingyu
    Huang, Qingming
    [J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2019, 15 (03)
  • [3] Improving CNN-RNN Hybrid Networks for Handwriting Recognition
    Dutta, Kartik
    Krishnan, Praveen
    Mathew, Minesh
    Jawahar, C. V.
    [J]. PROCEEDINGS 2018 16TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR), 2018, : 80 - 85
  • [4] CRAN: A Hybrid CNN-RNN Attention-Based Model for Text Classification
    Guo, Long
    Zhang, Dongxiang
    Wang, Lei
    Wang, Han
    Cui, Bin
    [J]. CONCEPTUAL MODELING, ER 2018, 2018, 11157 : 571 - 585
  • [5] Predicting Beijing Air Quality Using Bayesian Optimized CNN-RNN Hybrid Model
    Tu, Zihan
    Wu, Zhe
    [J]. 2022 ASIA CONFERENCE ON ALGORITHMS, COMPUTING AND MACHINE LEARNING (CACML 2022), 2022, : 581 - 587
  • [6] CRAN: An Hybrid CNN-RNN Attention-Based Model for Arabic Machine Translation
    Bensalah, Nouhaila
    Ayad, Habib
    Adib, Abdellah
    El Farouk, Abdelhamid Ibn
    [J]. NETWORKING, INTELLIGENT SYSTEMS AND SECURITY, 2022, 237 : 87 - 102
  • [7] A hybrid CNN-RNN model for rainfall-runoff modeling in the Potteruvagu watershed of India
    Shekar, Padala Raja
    Mathew, Aneesh
    Sharma, Kul Vaibhav
    [J]. CLEAN-SOIL AIR WATER, 2024,
  • [8] Dependency Exploitation: A Unified CNN-RNN Approach for Visual Emotion Recognition
    Zhu, Xinge
    Li, Liang
    Zhang, Weigang
    Rao, Tianrong
    Xu, Min
    Huang, Qingming
    Xu, Dong
    [J]. PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 3595 - 3601
  • [9] A CNN-RNN Hybrid Model with 2D Wavelet Transform Layer for Image Classification
    Dong, Zihao
    Zhang, Ruixun
    Shao, Xiuli
    [J]. 2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 1050 - 1056
  • [10] An OpenCL-Based Hybrid CNN-RNN Inference Accelerator On FPGA
    Sun, Yunfei
    Liu, Brian
    Xu, Xianchao
    [J]. 2019 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (ICFPT 2019), 2019, : 283 - 286