Attention-Based Bidirectional Recurrent Neural Networks for Description Generation of Videos

Cited by: 0
Authors
Du, Xiaotong [1]
Yuan, Jiabin [1]
Liu, Hu [1]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing, Jiangsu, Peoples R China
Source
CLOUD COMPUTING AND SECURITY, PT VI | 2018 / Vol. 11068
Keywords
Video description; Convolutional Neural Networks; Bidirectional Recurrent Neural Networks; Attention mechanism
DOI
10.1007/978-3-030-00021-9_40
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Describing videos in natural language is important in many applications, such as managing large online video collections and providing descriptive video service (DVS) for blind people. To improve on existing video description frameworks, this paper presents an end-to-end deep learning model that combines Convolutional Neural Networks (CNNs) and Bidirectional Recurrent Neural Networks (BiRNNs) with a multimodal attention mechanism. First, the model produces richer video representations than comparable prior work, covering image, motion, and audio features. Second, a BiRNN encoder processes these features in both the forward and backward directions. Finally, an attention-based decoder translates the encoder's sequential outputs into a sequence of words. The model is evaluated on the Microsoft Research Video Description Corpus (MSVD) dataset. The results demonstrate the necessity of combining BiRNNs with a multimodal attention mechanism and show that the model outperforms other state-of-the-art methods evaluated on this dataset.
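
The encoder and attention steps named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract gives no layer sizes, cell types, or attention formula, so everything here is an assumption made for illustration. The sketch uses a plain tanh RNN cell, random weights, per-frame concatenation of the image, motion, and audio vectors, and a single dot-product attention over the fused encoder states, which simplifies the multimodal attention the paper describes. All names (`rnn_pass`, `birnn_encode`, `attend`, the dimensions) are hypothetical.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rnn_pass(xs, W_in, W_rec):
    """Run a plain tanh RNN over xs; return the hidden state at every step."""
    h = [0.0] * len(W_rec)
    states = []
    for x in xs:
        pre = [a + b for a, b in zip(matvec(W_in, x), matvec(W_rec, h))]
        h = [math.tanh(p) for p in pre]
        states.append(h)
    return states


def birnn_encode(xs, fwd, bwd):
    """Concatenate forward and backward hidden states for each time step."""
    f = rnn_pass(xs, *fwd)
    # Run the backward RNN on the reversed sequence, then re-align it in time.
    b = rnn_pass(xs[::-1], *bwd)[::-1]
    return [hf + hb for hf, hb in zip(f, b)]


def attend(query, enc_states):
    """Dot-product soft attention: return (weights, context vector)."""
    scores = [sum(q * e for q, e in zip(query, s)) for s in enc_states]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(enc_states[0])
    context = [sum(w * s[i] for w, s in zip(weights, enc_states)) for i in range(dim)]
    return weights, context


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Assumed toy sizes: 5 frames; image, motion, and audio vectors of size 4, 3, 2.
rng = random.Random(0)
T, d_img, d_mot, d_aud, hidden = 5, 4, 3, 2, 6
d_in = d_img + d_mot + d_aud
# Stand-in features: each frame is the concatenation of the three modalities.
frames = [[rng.random() for _ in range(d_in)] for _ in range(T)]

fwd = (rand_matrix(hidden, d_in, rng), rand_matrix(hidden, hidden, rng))
bwd = (rand_matrix(hidden, d_in, rng), rand_matrix(hidden, hidden, rng))
enc = birnn_encode(frames, fwd, bwd)

# A stand-in decoder state queries the encoder outputs for one decoding step.
query = [rng.uniform(-1, 1) for _ in range(2 * hidden)]
weights, context = attend(query, enc)
```

In a full decoder, `context` would be combined with the previous word and decoder state to predict the next word, and `attend` would be called again at every decoding step. A trained model would also learn the weights rather than sample them at random.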
Pages: 440-451
Page count: 12