SRI3D: Two-stream inflated 3D ConvNet based on sparse regularization for action recognition

Cited by: 2
Authors
Yang, Zhaoqilin [1]
An, Gaoyun [1, 3]
Zhang, Ruichen [2]
Zheng, Zhenxing [1]
Ruan, Qiuqi [1]
Affiliations
[1] Beijing Jiaotong Univ, Inst Informat Sci, Beijing, Peoples R China
[2] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing, Peoples R China
[3] Beijing Jiaotong Univ, Inst Informat Sci, Beijing 100044, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
computer vision; convolutional neural nets; neural nets; video signal processing; NETWORKS
DOI
10.1049/ipr2.12725
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Although most state-of-the-art action recognition models adopt a two-stream 3D convolutional structure as the backbone network, few works have studied the impact of loss functions on action recognition models. Sparsity, meanwhile, is used as key prior knowledge in many fields, yet to the authors' knowledge no one has studied how the sparsity of the network output influences the results of deep learning-based action recognition models. This paper therefore proposes a novel two-stream inflated 3D ConvNet based on sparse regularization (SRI3D) for action recognition. To let the network learn a sparse output, the ℓ1 norm is embedded in the loss function as a regularization term in a plug-and-play manner. As a result, after the two streams are fused, the predicted class can only be the category with the highest confidence in one of the streams, and not any other category. The proposed loss function based on sparse regularization makes the output vector of the neural network as sparse as possible, so that the classification results are not ambiguous. Experimental results show that, compared with other state-of-the-art models, SRI3D has a competitive advantage on Kinetics-400, Something-Something V2, UCF-101 and HMDB-51.
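The idea of adding an ℓ1 term to a classification loss can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the exact formulation, so the code assumes the penalty is applied to the raw output vector (logits) of one stream, that the base loss is cross-entropy, and that the two streams are fused by averaging their softmax scores. The names `sparse_regularized_loss`, `fuse_two_streams`, and the weight `lam` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_regularized_loss(logits, target, lam=0.01):
    """Cross-entropy plus an l1 penalty on the output vector.

    Assumption: the penalty acts on the raw logits. Applying it to the
    softmax output would be pointless, since that vector always sums to 1.
    `lam` is a hypothetical regularization weight.
    """
    probs = softmax(logits)
    cross_entropy = -math.log(probs[target])
    l1_penalty = sum(abs(z) for z in logits)
    return cross_entropy + lam * l1_penalty


def fuse_two_streams(rgb_logits, flow_logits):
    """Late fusion by averaging per-stream softmax scores (an assumption).

    Returns the index of the class with the highest fused score.
    """
    rgb_probs = softmax(rgb_logits)
    flow_probs = softmax(flow_logits)
    fused = [(r + f) / 2 for r, f in zip(rgb_probs, flow_probs)]
    return max(range(len(fused)), key=fused.__getitem__)
```

Under these assumptions, an output vector with many large-magnitude entries costs more than one concentrated on a single class, which is the kind of pressure toward sparse outputs that the abstract describes.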
Pages: 1438-1448
Number of pages: 11
Related papers
  • [1] An Improved Two-stream Inflated 3D ConvNet for Abnormal Behavior Detection
    Pan, Jiahui
    Liu, Liangxin
    Lin, Mianfen
    Luo, Shengzhou
    Zhou, Chengju
    Liao, Huijian
    Wang, Fei
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2021, 30 (02): : 673 - 688
  • [2] Two-Stream 3D Convolution Attentional Network for Action Recognition
    Kusumoseniarto, Raden Hadapiningsyah
    2020 JOINT 9TH INTERNATIONAL CONFERENCE ON INFORMATICS, ELECTRONICS & VISION (ICIEV) AND 2020 4TH INTERNATIONAL CONFERENCE ON IMAGING, VISION & PATTERN RECOGNITION (ICIVPR), 2020,
  • [3] Two-Stream RNN/CNN for Action Recognition in 3D Videos
    Zhao, Rui
    Ali, Haider
    van der Smagt, Patrick
    2017 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2017, : 4260 - 4267
  • [4] 3D Convolutional Two-Stream Network for Action Recognition in Videos
    Li, Min
    Qi, Yuezhu
    Yang, Jian
    Zhang, Yanfang
    Ren, Junxing
    Du, Hong
    2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 1697 - 1701
  • [5] Two-Stream 3-D convNet Fusion for Action Recognition in Videos With Arbitrary Size and Length
    Wang, Xuanhan
    Gao, Lianli
    Wang, Peng
    Sun, Xiaoshuai
    Liu, Xianglong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2018, 20 (03) : 634 - 644
  • [6] An Efficient Multi-Scale Attention two-stream inflated 3D ConvNet network for cattle behavior
    Yang, Jucheng
    Jia, Qingxiang
    Han, Shujie
    Du, Zihan
    Liu, Jianzheng
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2025, 232
  • [7] Pose-Guided Inflated 3D ConvNet for action recognition in videos
    Wu, Qianyu
    Zhu, Aichun
    Cui, Ran
    Wang, Tian
    Hu, Fangqiang
    Bao, Yaping
    Snoussi, Hichem
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2021, 91
  • [8] Kinematics Features for 3D Action Recognition Using Two-Stream CNN
    Wang, Jiangliu
    Liu, Yunhui
    2018 13TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA), 2018, : 1731 - 1736
  • [9] An Improved Two-stream 3D Convolutional Neural Network for Human Action Recognition
    Chen, Jun
    Xu, Yuanping
    Zhang, Chaolong
    Xu, Zhijie
    Meng, Xiangxiang
    Wang, Jie
    2019 25TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATION AND COMPUTING (ICAC), 2019, : 135 - 140
  • [10] Improving human action recognition with two-stream 3D convolutional neural network
    Van-Minh Khong
    Thanh-Hai Tran
    2018 1ST INTERNATIONAL CONFERENCE ON MULTIMEDIA ANALYSIS AND PATTERN RECOGNITION (MAPR), 2018,