Three-Stream Action Tubelet Detector for Spatiotemporal Action Detection in Videos

Cited by: 1
Authors
Wu, Yutang [1, 2]
Wang, Hanli [1, 2]
Li, Qinyu [1, 3]
Affiliations
[1] Tongji Univ, Dept Comp Sci & Technol, Shanghai 201804, Peoples R China
[2] Tongji Univ, Key Lab Embedded Syst & Serv Comp, Minist Educ, Shanghai 200092, Peoples R China
[3] Lanzhou City Univ, Dept Comp Sci, Lanzhou 730070, Gansu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Human action detection; Three-stream architecture; Action tubelet detector; Pose stream;
DOI
10.1007/978-3-030-00767-6_28
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In recent years, human action detection in videos has gained wide attention. Instead of detecting frame by frame, a model named action tubelet (ACT) detector detects human actions sequence by sequence and achieves remarkable performance in both accuracy and speed using two streams. In this work, a three-stream action tubelet detector (three-stream ACT detector) is proposed; compared to the two-stream architecture, it adds an extra pose stream to obtain more information about human actions and fuses the three streams by weighted average. The experimental results on the benchmark UCF-Sports, J-HMDB and UCF-101 datasets demonstrate that the proposed three-stream ACT detector framework is able to boost the performance of human action detection.
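As a rough illustration of the weighted-average fusion mentioned in the abstract, the sketch below combines per-class scores from three streams. It is not the authors' implementation: the function name `fuse_scores`, the stream names `appearance` and `motion` (the abstract names only the pose stream), and the example weights are all assumptions made for illustration.

```python
from typing import List, Sequence


def fuse_scores(
    appearance: Sequence[float],
    motion: Sequence[float],
    pose: Sequence[float],
    weights: Sequence[float] = (1.0, 1.0, 1.0),
) -> List[float]:
    """Weighted average of per-class scores from three streams.

    Hypothetical helper: the stream names and default weights are
    assumptions, not values taken from the paper.
    """
    if not (len(appearance) == len(motion) == len(pose)):
        raise ValueError("all streams must score the same number of classes")
    if len(weights) != 3:
        raise ValueError("exactly three weights are required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalise so the fused score stays on the same scale as the inputs.
    w_app, w_mot, w_pose = (w / total for w in weights)
    return [
        w_app * a + w_mot * m + w_pose * p
        for a, m, p in zip(appearance, motion, pose)
    ]


# Example with made-up scores for two action classes; the pose stream
# is given twice the weight of the other two streams.
fused = fuse_scores([0.2, 0.8], [0.4, 0.6], [0.6, 0.4], weights=(1, 1, 2))
```

Normalising the weights keeps the fused scores comparable to the single-stream scores, so any threshold tuned on one stream remains meaningful after fusion.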
Pages: 296-306
Page count: 11