Video Description Using Bidirectional Recurrent Neural Networks

Cited by: 18
Authors
Peris, Alvaro [1]
Bolanos, Marc [2, 3]
Radeva, Petia [2, 3]
Casacuberta, Francisco [1]
Affiliations
[1] Univ Politecn Valencia, PRHLT Res Ctr, Valencia, Spain
[2] Univ Barcelona, Barcelona, Spain
[3] Comp Vision Ctr, Bellaterra, Spain
Keywords
Video description; Neural Machine Translation; Bidirectional Recurrent Neural Networks; LSTM; Convolutional Neural Networks
DOI
10.1007/978-3-319-44781-0_1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Although traditionally used in the machine translation field, the encoder-decoder framework has recently been applied to the generation of video and image descriptions. The combination of Convolutional and Recurrent Neural Networks in these models has proven to outperform the previous state of the art, obtaining more accurate video descriptions. In this work, we propose pushing this model further by introducing two contributions into the encoding stage: first, producing richer image representations by combining object and location information from Convolutional Neural Networks; and second, introducing Bidirectional Recurrent Neural Networks to capture both forward and backward temporal relationships in the input frames.
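The two encoder-side ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps the LSTM units for a plain tanh recurrence, uses toy vectors in place of CNN features, and every function name, size, and weight initialisation here is a hypothetical choice made for illustration only.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (illustrative initialisation only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def rnn_pass(inputs, w_in, w_rec):
    """Run a plain tanh RNN over `inputs`; return one hidden state per step."""
    hidden_size = len(w_rec)
    h = [0.0] * hidden_size
    states = []
    for x in inputs:
        h = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], h))
            )
            for j in range(hidden_size)
        ]
        states.append(h)
    return states


def encode_video(object_feats, scene_feats, hidden_size=4, seed=0):
    """Hypothetical encoder: fuse two feature streams, then read both ways."""
    rng = random.Random(seed)
    # Richer frame representation: concatenate object and location features.
    frames = [o + s for o, s in zip(object_feats, scene_feats)]
    dim = len(frames[0])
    fwd_in = make_matrix(hidden_size, dim, rng)
    fwd_rec = make_matrix(hidden_size, hidden_size, rng)
    bwd_in = make_matrix(hidden_size, dim, rng)
    bwd_rec = make_matrix(hidden_size, hidden_size, rng)
    forward = rnn_pass(frames, fwd_in, fwd_rec)
    # Backward pass reads the frames in reverse; re-reverse to align by frame.
    backward = rnn_pass(frames[::-1], bwd_in, bwd_rec)[::-1]
    # Each frame is annotated with its forward and backward states.
    return [f + b for f, b in zip(forward, backward)]


# Toy input: 3 frames, 2 object features and 2 scene features per frame.
object_feats = [[0.1, 0.2], [0.3, 0.1], [0.0, 0.5]]
scene_feats = [[0.4, 0.0], [0.2, 0.2], [0.1, 0.3]]
encoded = encode_video(object_feats, scene_feats)
```

Each entry of `encoded` holds the forward state (summarising the frames seen so far) followed by the backward state (summarising the frames still to come), which is the kind of per-frame annotation a decoder could then consume.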
Pages: 3-11
Page count: 9
Related papers
50 in total
  • [31] Predicting the empirical distribution of video quality scores using recurrent neural networks
    Otroshi Shahreza H.
    Amini A.
    Behroozi H.
    International Journal of Engineering, Transactions B: Applications, 2020, 33 (05): : 984 - 991
  • [32] Video Retrieval System Using Parallel Multi-Class Recurrent Neural Network Based on Video Description
    Jabeen, Saira
    Khan, Gulraiz
    Naveed, Humza
    Khan, Zeeshan
    Khan, Usman Ghani
    2018 14TH INTERNATIONAL CONFERENCE ON EMERGING TECHNOLOGIES (ICET), 2018,
  • [33] Video Super-Resolution via Bidirectional Recurrent Convolutional Networks
    Huang, Yan
    Wang, Wei
    Wang, Liang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04) : 1015 - 1028
  • [34] Recurrent Graph Neural Networks for Video Instance Segmentation
    Brissman, Emil
    Johnander, Joakim
    Danelljan, Martin
    Felsberg, Michael
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2023, 131 (02) : 471 - 495
  • [35] Recurrent Neural Networks for Online Video Popularity Prediction
    Trzcinski, Tomasz
    Andruszkiewicz, Pawel
    Bochenski, Tomasz
    Rokita, Przemyslaw
    FOUNDATIONS OF INTELLIGENT SYSTEMS, ISMIS 2017, 2017, 10352 : 146 - 153
  • [36] Recurrent Graph Neural Networks for Video Instance Segmentation
    Emil Brissman
    Joakim Johnander
    Martin Danelljan
    Michael Felsberg
    International Journal of Computer Vision, 2023, 131 : 471 - 495
  • [37] Unsupervised Video Summarization with Independently Recurrent Neural Networks
    Yaliniz, Gokhan
    Ikizler-Cinbis, Nazli
    2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2019,
  • [38] Folded Recurrent Neural Networks for Future Video Prediction
    Oliu, Marc
    Selva, Javier
    Escalera, Sergio
    COMPUTER VISION - ECCV 2018, PT XIV, 2018, 11218 : 745 - 761
  • [39] ASR ERROR DETECTION AND RECOGNITION RATE ESTIMATION USING DEEP BIDIRECTIONAL RECURRENT NEURAL NETWORKS
    Ogawa, Atsunori
    Hori, Takaaki
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4370 - 4374
  • [40] Detection of Paroxysmal Atrial Fibrillation using Attention-based Bidirectional Recurrent Neural Networks
    Shashikumar, Supreeth P.
    Shah, Amit J.
    Clifford, Gari D.
    Nemati, Shamim
    KDD'18: PROCEEDINGS OF THE 24TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2018, : 715 - 723