Attention Based End-to-End Network for Short Video Classification

Authors
Zhu, Hui [1]
Zou, Chao [2]
Wang, Zhenyu [2]
Xu, Kai [2]
Huang, Zihao [2]
Affiliations
[1] Guangdong Mech & Elect Polytech, Sch Econ & Trade, Guangzhou, Peoples R China
[2] South China Univ Technol, Sch Software Engn, Guangzhou, Peoples R China
Keywords
Short Video Classification; Deep Learning; Convolutional Neural Network; Self-Attention Mechanism
DOI
10.1109/MSN57253.2022.00084
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
It has been shown that three-dimensional (3D) convolutional kernels can effectively capture local features across the spatiotemporal range of videos, which has led to impressive results for various models on video-related tasks. With the introduction of the Transformer and the rise of the self-attention mechanism, more self-attention models have recently been applied to video representation learning. However, each type of model has its limitations: local perception in convolutional models and the self-attention operation in self-attention models. Inspired by the global context network (GCNet), we take advantage of both 3D convolution and the self-attention mechanism to design a novel operator called the GC-Conv block. The block performs local feature extraction and global context modeling with channel-level concatenation, similar to the dense connectivity pattern in DenseNet, while remaining lightweight. Furthermore, we apply it in multiple layers of our proposed end-to-end network for the short video classification task, while temporal dependencies are captured via dilated convolutions and a bidirectional GRU for better representation. Finally, our model outperforms both state-of-the-art convolutional models and self-attention models on three human action recognition datasets with considerably fewer parameters, which demonstrates its effectiveness.
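The idea of pairing local features with a global context through channel-level concatenation can be illustrated with a minimal sketch. This is not the authors' GC-Conv block, whose details the abstract does not give. It assumes a GCNet-style attention pooling (one logit per position, softmax over all positions, weighted sum per channel) and treats the local branch as already computed. The function names `global_context` and `concat_local_and_global`, the `key_weights` parameter, and the flattened channels-by-positions layout are all hypothetical.

```python
import math


def global_context(features, key_weights):
    """Attention-pool a feature map into one context value per channel.

    features: list of C channels, each a list of N values, where N is the
              number of flattened spatiotemporal positions (assumed layout).
    key_weights: C weights standing in for a 1x1x1 convolution that maps
                 each position to a single attention logit (assumption).
    """
    num_channels = len(features)
    num_positions = len(features[0])
    # One attention logit per position: a linear map across channels.
    logits = [
        sum(key_weights[c] * features[c][n] for c in range(num_channels))
        for n in range(num_positions)
    ]
    # Numerically stable softmax over all positions.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    attention = [e / total for e in exps]
    # Attention-weighted sum over positions, computed per channel.
    return [
        sum(attention[n] * features[c][n] for n in range(num_positions))
        for c in range(num_channels)
    ]


def concat_local_and_global(local_features, context):
    """Broadcast the context to every position and append it as new channels."""
    num_positions = len(local_features[0])
    broadcast = [[value] * num_positions for value in context]
    return local_features + broadcast


# Hypothetical input: 2 channels, 3 positions (stand-in for local conv output).
local = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
context = global_context(local, key_weights=[0.0, 0.0])
fused = concat_local_and_global(local, context)
print(len(fused), "channels x", len(fused[0]), "positions")
```

Concatenation doubles the channel count here, whereas GCNet fuses the context by addition. In a real network the local branch would be a 3D convolution and the fused tensor would feed the next layer.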
Pages: 490-494
Number of pages: 5