Sequential Memory Modelling for Video Captioning

Cited: 0
Authors
Puttaraja [1 ]
Nayaka, Chidambara [1 ]
Manikesh [1 ]
Sharma, Nitin [1 ]
Anand, Kumar M. [1 ]
Affiliations
[1] Natl Inst Technol Karnataka, Dept Informat Technol, Surathkal 575025, India
Keywords
Deep learning; NLP; Natural Language Processing; LSTM; Encoder-Decoder Model
DOI
10.1109/INDICON56171.2022.10039829
CLC Number
TP39 [Computer Applications]
Subject Classification
081203; 0835
Abstract
In recent years, the automatic generation of natural language descriptions of video has become a focus of research in deep learning and natural language processing. Video understanding has many applications, such as video search and indexing, but video captioning remains a sophisticated problem for complex and diverse types of video content. Bridging video and natural language is still an open issue, and many methods have been proposed to generate such descriptions automatically. Deep learning has become the dominant approach in video processing thanks to its performance and high-speed computing capabilities. This paper discusses a frame-based, end-to-end encoder-decoder network built on a deep learning approach to generate captions, and describes the model, the dataset, and the parameters used to evaluate it.
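As a rough illustration of the frame-based, end-to-end encoder-decoder approach described above, the sketch below encodes pre-extracted per-frame CNN features with an LSTM and decodes a caption word by word with a second LSTM under teacher forcing. It is a minimal sketch under assumed dimensions, vocabulary size, and feature extractor, not the authors' exact architecture or hyperparameters.

import torch
import torch.nn as nn

class VideoCaptioner(nn.Module):
    """Minimal LSTM encoder-decoder sketch for video captioning (illustrative only)."""
    def __init__(self, feat_dim=2048, embed_dim=256, hidden_dim=512, vocab_size=10000):
        super().__init__()
        # Encoder: summarizes the sequence of per-frame CNN features into a hidden state.
        self.encoder = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        # Decoder: generates the caption conditioned on the encoder's final state.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, frame_feats, caption_tokens):
        # frame_feats: (batch, num_frames, feat_dim); caption_tokens: (batch, seq_len) word ids.
        _, (h, c) = self.encoder(frame_feats)          # encode the frame sequence
        dec_in = self.embed(caption_tokens)            # teacher forcing with ground-truth words
        dec_out, _ = self.decoder(dec_in, (h, c))      # decode from the video's final state
        return self.out(dec_out)                       # (batch, seq_len, vocab_size) logits

# Usage sketch: two videos of 8 frames with 2048-d features, captions of 12 tokens.
model = VideoCaptioner()
feats = torch.randn(2, 8, 2048)
tokens = torch.randint(0, 10000, (2, 12))
logits = model(feats, tokens)   # shape: (2, 12, 10000)

In a setup like this, training would minimize cross-entropy between the logits and the shifted ground-truth caption, while at inference the decoder would be unrolled greedily or with beam search from a start token.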
Pages: 8
Related Papers
50 records in total
  • [21] MIRA-CAP: Memory-Integrated Retrieval-Augmented Captioning for State-of-the-Art Image and Video Captioning. Umirzakova, Sabina; Muksimova, Shakhnoza; Mardieva, Sevara; Sultanov Baxtiyarovich, Murodjon; Cho, Young-Im. Sensors, 2024, 24(24).
  • [22] Streamlined Dense Video Captioning. Mun, Jonghwan; Yang, Linjie; Ren, Zhou; Xu, Ning; Han, Bohyung. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), 2019: 3581+.
  • [23] Video Captioning with Listwise Supervision. Liu, Yuan; Li, Xue; Shi, Zhongchao. Thirty-First AAAI Conference on Artificial Intelligence, 2017: 4197-4203.
  • [24] Sequence in sequence for video captioning. Wang, Huiyun; Gao, Chongyang; Han, Yahong. Pattern Recognition Letters, 2020, 130: 327-334.
  • [25] Reconstruction Network for Video Captioning. Wang, Bairui; Ma, Lin; Zhang, Wei; Liu, Wei. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 7622-7631.
  • [26] Video Captioning with Semantic Guiding. Yuan, Jin; Tian, Chunna; Zhang, Xiangnan; Ding, Yuxuan; Wei, Wei. 2018 IEEE Fourth International Conference on Multimedia Big Data (BigMM), 2018.
  • [27] Multirate Multimodal Video Captioning. Yang, Ziwei; Xu, Youjiang; Wang, Huiyun; Wang, Bo; Han, Yahong. Proceedings of the 2017 ACM Multimedia Conference (MM'17), 2017: 1877-1882.
  • [28] Video Captioning with Tube Features. Zhao, Bin; Li, Xuelong; Lu, Xiaoqiang. Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018: 1177-1183.
  • [29] Survey of Dense Video Captioning. Huang, Xiankai; Zhang, Jiayu; Wang, Xinyu; Wang, Xiaochuan; Liu, Ruijun. Computer Engineering and Applications, 2023, 59(12): 28-48.
  • [30] Video Captioning by Adversarial LSTM. Yang, Yang; Zhou, Jie; Ai, Jiangbo; Bin, Yi; Hanjalic, Alan; Shen, Heng Tao; Ji, Yanli. IEEE Transactions on Image Processing, 2018, 27(11): 5600-5611.