Video to Text Study using an Encoder-Decoder Networks Approach

Cited by: 0

Authors:

Ismael Orozco, Carlos [1]

Elena Buemi, Maria [2]

Jacobo Berlles, Julio [2]

Affiliations:

[1] Univ Nacl Salta, FCE, Dept Informat, Salta, Argentina

[2] Univ Buenos Aires, FCEyN, Dept Comp, Buenos Aires, DF, Argentina

Source:

2018 37TH INTERNATIONAL CONFERENCE OF THE CHILEAN COMPUTER SCIENCE SOCIETY (SCCC) | 2018

Keywords:

Video Summarization; Long Short-Term Memory; Deep Learning; Natural Language Processing

DOI:

Not available

Chinese Library Classification number:

TP301 [Theory and methods]

Discipline classification code:

081202

Abstract:

The automatic generation of video descriptions is currently a topic of interest in computer vision because of applications such as web indexing and video description for people with visual disabilities, among others. In this work we present an encoder-decoder neural network architecture. First, a 3D Convolutional Neural Network extracts features from the input video. Then, a Long Short-Term Memory network decodes the resulting feature vector to automatically generate a description of the video. For training and testing we use the Microsoft Video Description Corpus (MSVD) data set. We evaluate the performance of our system using the evaluation protocol of the COCO Image Captioning Challenge. We obtain 0.3984, 0.2941 and 0.5052 for the BLEU, METEOR and CIDEr metrics, respectively. These results are competitive with those reported in the literature.
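To make the BLEU figure more concrete, the following is a minimal sketch of a sentence-level BLEU score (clipped n-gram precision combined with a brevity penalty). It is an illustration only: it is not the COCO evaluation code the abstract refers to, and the function names `ngrams` and `bleu`, the whitespace tokenization, the absence of smoothing and the example captions are all assumptions made here. The published score would have been computed at corpus level by the challenge tooling, so this sketch is not expected to reproduce it.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU with uniform weights and no smoothing (assumed setup)."""
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if clipped == 0 or total == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty uses the reference length closest to the candidate length.
    ref_len = min((len(r) for r in refs), key=lambda L: (abs(L - len(cand)), L))
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    # Hypothetical captions, not taken from MSVD.
    print(bleu("a man is playing a guitar", ["a man is playing a guitar"]))
    print(bleu("the cat sat", ["the cat sat on the mat"], max_n=2))
```

A candidate identical to a reference scores 1, while a candidate that is correct but much shorter than the reference is penalized by the brevity term alone.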


Pages: 5
