Temporal Bilinear Networks for Video Action Recognition

Cited by: 0
Authors
Li, Yanghao [1]
Song, Sijie [1]
Li, Yuqi [1]
Liu, Jiaying [1]
Affiliations
[1] Peking Univ, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Temporal modeling in videos is a fundamental yet challenging problem in computer vision. In this paper, we propose a novel Temporal Bilinear (TB) model to capture the temporal pairwise feature interactions between adjacent frames. Compared with existing temporal methods that are limited to linear transformations, our TB model considers explicit quadratic bilinear transformations in the temporal domain for motion evolution and sequential relation modeling. We further leverage a factorized bilinear model with linear complexity and a bottleneck network design to build our TB blocks, which also constrains the parameter count and computation cost. We consider two schemes for combining TB blocks with the original 2D spatial convolutions, namely wide and deep Temporal Bilinear Networks (TBNs). Finally, we perform experiments on several widely adopted datasets, including Kinetics, UCF101 and HMDB51. The effectiveness of our TBNs is validated by comprehensive ablation analyses and comparisons with various state-of-the-art methods.
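As a rough illustration of the idea in the abstract, the sketch below computes a low-rank (factorized) bilinear interaction between each pair of adjacent frame features. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `temporal_bilinear`, the factor layouts `U` and `V`, and the rank-R parameterization W_k = sum_r u_{k,r} v_{k,r}^T are all hypothetical choices made for illustration, and the bottleneck design and the wide/deep network schemes are not modeled.

```python
def temporal_bilinear(frames, U, V):
    """Factorized bilinear interaction between adjacent frames.

    frames: list of T feature vectors, each of length C.
    U, V:   hypothetical factors; U[k][r] and V[k][r] are length-C vectors,
            so the implied full matrix is W_k = sum_r outer(U[k][r], V[k][r]).
    Returns a (T - 1) x K nested list with
        y[t][k] = x_t^T W_k x_{t+1}
    computed without ever forming the C x C matrix W_k.
    """
    def dot(a, b):
        return sum(p * q for p, q in zip(a, b))

    out = []
    for x_t, x_next in zip(frames, frames[1:]):
        out.append([
            # Two rank-R projections cost O(R * C) instead of O(C^2).
            sum(dot(u, x_t) * dot(v, x_next) for u, v in zip(U_k, V_k))
            for U_k, V_k in zip(U, V)
        ])
    return out


# Toy input: T = 3 frames, C = 2 channels, K = 1 output, rank R = 1.
frames = [[1, 0], [0, 1], [1, 1]]
U = [[[1, 2]]]
V = [[[3, 4]]]
result = temporal_bilinear(frames, U, V)
```

In this toy case the implied matrix is W = [[3, 4], [6, 8]], so the two adjacent-frame outputs are 4 and 14. Unlike a linear temporal filter, each output is a product of terms from two different frames, which is what makes the transformation quadratic in the input features.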
Pages: 8674-8681
Number of pages: 8