Video to Text Study using an Encoder-Decoder Networks Approach

Citations: 0
Authors
Ismael Orozco, Carlos [1]
Elena Buemi, Maria [2]
Jacobo Berlles, Julio [2]
Affiliations
[1] Univ Nacl Salta, FCE, Dept Informat, Salta, Argentina
[2] Univ Buenos Aires, FCEyN, Dept Comp, Buenos Aires, DF, Argentina
Keywords
Video Summarization; Long Short-Term Memory; Deep Learning; Natural Language Processing
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
The automatic generation of video descriptions is currently a topic of interest in computer vision because of applications such as web indexing and video description for people with visual disabilities. In this work we present an encoder-decoder neural network architecture. First, a 3D Convolutional Neural Network extracts features from the input video. Then, a Long Short-Term Memory network decodes the resulting feature vector to generate the description of the video automatically. For training and testing we use the Microsoft Video Description Corpus (MSVD) data set. We evaluate the performance of our system using the evaluation of the COCO Image Captioning Challenge, obtaining 0.3984, 0.2941 and 0.5052 for the BLEU, METEOR and CIDEr metrics, respectively. These results are competitive with those reported in the literature.
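As a rough illustration of what a BLEU score measures, the sketch below computes a sentence-level BLEU from clipped n-gram precisions and a brevity penalty. It is not the evaluation code used by the authors or by the COCO challenge; the function names (`ngrams`, `modified_precision`, `bleu`), the whitespace tokenization, and the absence of smoothing are all assumptions made for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: a candidate n-gram is credited at most as
    many times as it occurs in the single reference where it is most frequent."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights (illustrative only)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Reference length closest to the candidate length (ties go to the shorter one).
    r = min((len(ref) for ref in references), key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


if __name__ == "__main__":
    # Hypothetical caption and references, not taken from MSVD.
    caption = "a man is playing a guitar".split()
    refs = ["a man is playing a guitar".split(),
            "someone plays the guitar".split()]
    print(bleu(caption, refs))
```

Corpus-level scores such as those reported in the abstract aggregate n-gram counts over all test videos rather than averaging per-sentence values, so this sketch only conveys the idea behind the metric.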
Pages: 5
Related papers
50 in total
  • [1] Text Normalization Using Encoder-Decoder Networks Based on the Causal Feature Extractor
    Javaloy, Adrian
    Garcia-Mateos, Gines
    [J]. APPLIED SCIENCES-BASEL, 2020, 10 (13):
  • [2] On Mining Conditions using Encoder-decoder Networks
    Gallego, Fernando O.
    Corchuelo, Rafael
    [J]. PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE (ICAART), VOL 2, 2019, : 624 - 630
  • [3] Retrieval Augmented Convolutional Encoder-decoder Networks for Video Captioning
    Chen, Jingwen
    Pan, Yingwei
    Li, Yehao
    Yao, Ting
    Chao, Hongyang
    Mei, Tao
    [J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2023, 19 (01)
  • [4] Video Summarization With Attention-Based Encoder-Decoder Networks
    Ji, Zhong
    Xiong, Kailin
    Pang, Yanwei
    Li, Xuelong
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (06) : 1709 - 1717
  • [5] Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning
    Chen, Jingwen
    Pan, Yingwei
    Li, Yehao
    Yao, Ting
    Chao, Hongyang
    Mei, Tao
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 8167 - 8174
  • [6] AttentionHTR: Handwritten Text Recognition Based on Attention Encoder-Decoder Networks
    Kass, Dmitrijs
    Vats, Ekta
    [J]. DOCUMENT ANALYSIS SYSTEMS, DAS 2022, 2022, 13237 : 507 - 522
  • [7] Unsupervised Feature Selection using Encoder-Decoder Networks
    SharifiPour, Sasan
    Fayyazi, Hossein
    Sabokro, Mohammad
    [J]. 2020 6TH IRANIAN CONFERENCE ON SIGNAL PROCESSING AND INTELLIGENT SYSTEMS (ICSPIS), 2020,
  • [8] An automated choroid segmentation approach using transfer learning and encoder-decoder networks
    Suthaharan, Shan
    Chhablani, Gunjan
    Vupparaboina, Kiran Kumar
    Sahel, Jose-Alain
    Dansingani, Kunal K.
    Chhablani, Jay
    [J]. INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2021, 62 (08)
  • [9] Video Anomaly Detection Using Encoder-Decoder Networks with Video Vision Transformer and Channel Attention Blocks
    Kobayashi, Shimpei
    Hizukuri, Akiyoshi
    Nakayama, Ryohei
    [J]. 2023 18TH INTERNATIONAL CONFERENCE ON MACHINE VISION AND APPLICATIONS, MVA, 2023,
  • [10] Correlation Encoder-Decoder Model for Text Generation
    Zhang, Xu
    Li, Yifeng
    Peng, Xueping
    Qiao, Xinxiao
    Zhang, Hui
    Lu, Wenpeng
    [J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,