Three-Stream Action Tubelet Detector for Spatiotemporal Action Detection in Videos

Cited by: 1
Authors
Wu, Yutang [1, 2]
Wang, Hanli [1, 2]
Li, Qinyu [1, 3]
Affiliations
[1] Tongji Univ, Dept Comp Sci & Technol, Shanghai 201804, Peoples R China
[2] Tongji Univ, Key Lab Embedded Syst & Serv Comp, Minist Educ, Shanghai 200092, Peoples R China
[3] Lanzhou City Univ, Dept Comp Sci, Lanzhou 730070, Gansu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Human action detection; Three-stream architecture; Action tubelet detector; Pose stream;
DOI
10.1007/978-3-030-00767-6_28
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In recent years, human action detection in videos has gained wide attention. Instead of detecting frame by frame, a model named the action tubelet (ACT) detector detects human actions sequence by sequence and, in its two-stream form, achieves remarkable performance in both accuracy and speed. In this work, a three-stream action tubelet detector (three-stream ACT detector) is proposed. Compared to the two-stream architecture, it adds an extra pose stream to obtain more information about human actions and fuses the three streams by weighted average. Experimental results on the benchmark UCF-Sports, J-HMDB and UCF-101 datasets demonstrate that the proposed three-stream ACT detector framework is able to boost the performance of human action detection.
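The weighted-average fusion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `fuse_scores`, the stream labels (`rgb`, `flow`, `pose`), the example scores, and the weights are all hypothetical, and the abstract does not state at which point in the detector the fusion is applied or which weights are used.

```python
def fuse_scores(stream_scores, weights):
    """Weighted average of per-class scores from several streams.

    stream_scores: list of equal-length score lists, one per stream.
    weights: one non-negative weight per stream (hypothetical values).
    """
    if len(stream_scores) != len(weights):
        raise ValueError("need exactly one weight per stream")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(stream_scores[0])
    if any(len(s) != n_classes for s in stream_scores):
        raise ValueError("all streams must score the same classes")
    # Normalise by the weight sum so the result stays on the input scale.
    return [
        sum(w * s[c] for w, s in zip(weights, stream_scores)) / total
        for c in range(n_classes)
    ]


# Illustrative per-class scores for one candidate tubelet (made-up numbers).
rgb = [0.7, 0.2, 0.1]
flow = [0.5, 0.4, 0.1]
pose = [0.6, 0.1, 0.3]

# Assumed weights: the pose stream counts half as much as the other two.
fused = fuse_scores([rgb, flow, pose], weights=[1.0, 1.0, 0.5])
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

Dividing by the sum of the weights keeps the fused scores on the same scale as the inputs, so setting all weights equal reduces to a plain average.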
Pages: 296-306
Number of pages: 11
Related papers
50 in total
  • [1] Discriminative action tubelet detector for weakly-supervised action detection
    Lee, Jiyoung
    Kim, Seungryong
    Kim, Sunok
    Sohn, Kwanghoon
    PATTERN RECOGNITION, 2024, 155
  • [2] Three-stream CNNs for action recognition
    Wang, Liangliang
    Ge, Lianzheng
    Li, Ruifeng
    Fang, Yajun
    PATTERN RECOGNITION LETTERS, 2017, 92 : 33 - 40
  • [3] Beyond Two-stream: Skeleton-based Three-stream Networks for Action Recognition in Videos
    Xu, Jianfeng
    Tasaka, Kazuyuki
    Yanagihara, Hiromasa
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 1567 - 1573
  • [4] ENHANCED ACTION TUBELET DETECTOR FOR SPATIO-TEMPORAL VIDEO ACTION DETECTION
    Wu, Yutang
    Wang, Hanli
    Wang, Shuheng
    Li, Qinyu
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 2388 - 2392
  • [5] Action Tubelet Detector for Spatio-Temporal Action Localization
    Kalogeiton, Vicky
    Weinzaepfel, Philippe
    Ferrari, Vittorio
    Schmid, Cordelia
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 4415 - 4423
  • [6] Three-Stream Network With Bidirectional Self-Attention for Action Recognition in Extreme Low Resolution Videos
    Purwanto, Didik
    Pramono, Rizard Renanda Adhi
    Chen, Yie-Tarng
    Fang, Wen-Hsien
    IEEE SIGNAL PROCESSING LETTERS, 2019, 26 (08) : 1187 - 1191
  • [7] Multi-Modal Three-Stream Network for Action Recognition
    Khalid, Muhammad Usman
    Yu, Jie
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 3210 - 3215
  • [8] Tifar-net: three-stream inception former-based action recognition network for infrared videos
    Imran, Javed
    Rajput, Amitesh Singh
    Vashisht, Rohit
    SIGNAL IMAGE AND VIDEO PROCESSING, 2025, 19 (02)
  • [9] Trajectory-aware three-stream CNN for video action recognition
    Weng, Zhengkui
    Guan, Yepeng
    JOURNAL OF ELECTRONIC IMAGING, 2019, 28 (02)
  • [10] Sequential Deep Trajectory Descriptor for Action Recognition With Three-Stream CNN
    Shi, Yemin
    Tian, Yonghong
    Wang, Yaowei
    Huang, Tiejun
    IEEE TRANSACTIONS ON MULTIMEDIA, 2017, 19 (07) : 1510 - 1520