Three-Stream Action Tubelet Detector for Spatiotemporal Action Detection in Videos

Cited by: 1

Authors:

Wu, Yutang [1,2]

Wang, Hanli [1,2]

Li, Qinyu [1,3]

Affiliations:

[1] Tongji Univ, Dept Comp Sci & Technol, Shanghai 201804, Peoples R China

[2] Tongji Univ, Key Lab Embedded Syst & Serv Comp, Minist Educ, Shanghai 200092, Peoples R China

[3] Lanzhou City Univ, Dept Comp Sci, Lanzhou 730070, Gansu, Peoples R China

Source:

ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2018, PT II | 2018 / Vol. 11165

Funding:

National Natural Science Foundation of China;

Keywords:

Human action detection; Three-stream architecture; Action tubelet detector; Pose stream;

DOI:

10.1007/978-3-030-00767-6_28

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

In recent years, human action detection in videos has gained wide attention. Instead of detecting frame by frame, the action tubelet (ACT) detector detects human actions sequence by sequence and achieves remarkable accuracy and speed with a two-stream design. In this work, a three-stream action tubelet detector (three-stream ACT detector) is proposed. Compared with the two-stream architecture, it adds an extra pose stream to obtain more information about human actions and fuses the three streams by weighted average. Experimental results on the benchmark UCF-Sports, J-HMDB and UCF-101 datasets demonstrate that the proposed three-stream ACT detector framework boosts the performance of human action detection.
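
The weighted-average fusion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `fuse_streams`, the stream order (appearance, motion, pose), and the example weights are all hypothetical, since the record does not give the actual weights or score format.

```python
def fuse_streams(scores, weights):
    """Weighted average of per-class scores from several streams.

    scores  -- one list of class scores per stream (hypothetical format)
    weights -- one non-negative weight per stream (hypothetical values)
    """
    if len(scores) != len(weights):
        raise ValueError("need exactly one weight per stream")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    num_classes = len(scores[0])
    if any(len(s) != num_classes for s in scores):
        raise ValueError("all streams must score the same classes")
    # Dividing by the weight total makes this an average, not just a sum.
    return [
        sum(w * s[c] for w, s in zip(weights, scores)) / total
        for c in range(num_classes)
    ]


# Illustrative scores for two classes from three assumed streams.
appearance = [0.6, 0.4]
motion = [0.2, 0.8]
pose = [0.4, 0.6]
fused = fuse_streams([appearance, motion, pose], weights=[1, 1, 2])
```

Setting the third weight to zero reduces the sketch to a two-stream average, which is one way to picture what the extra pose stream adds.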

Pages: 296-306

Page count: 11

50 records in total

[1] Discriminative action tubelet detector for weakly-supervised action detection
Lee, Jiyoung
Kim, Seungryong
Kim, Sunok
Sohn, Kwanghoon
PATTERN RECOGNITION, 2024, 155
[2] Three-stream CNNs for action recognition
Wang, Liangliang
Ge, Lianzheng
Li, Ruifeng
Fang, Yajun
PATTERN RECOGNITION LETTERS, 2017, 92 : 33 - 40
[3] Beyond Two-stream: Skeleton-based Three-stream Networks for Action Recognition in Videos
Xu, Jianfeng
Tasaka, Kazuyuki
Yanagihara, Hiromasa
2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 1567 - 1573
[4] Enhanced Action Tubelet Detector for Spatio-Temporal Video Action Detection
Wu, Yutang
Wang, Hanli
Wang, Shuheng
Li, Qinyu
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 2388 - 2392
[5] Action Tubelet Detector for Spatio-Temporal Action Localization
Kalogeiton, Vicky
Weinzaepfel, Philippe
Ferrari, Vittorio
Schmid, Cordelia
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 4415 - 4423
[6] Three-Stream Network With Bidirectional Self-Attention for Action Recognition in Extreme Low Resolution Videos
Purwanto, Didik
Pramono, Rizard Renanda Adhi
Chen, Yie-Tarng
Fang, Wen-Hsien
IEEE SIGNAL PROCESSING LETTERS, 2019, 26 (08) : 1187 - 1191
[7] Multi-Modal Three-Stream Network for Action Recognition
Khalid, Muhammad Usman
Yu, Jie
2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 3210 - 3215
[8] Tifar-net: three-stream inception former-based action recognition network for infrared videos
Imran, Javed
Rajput, Amitesh Singh
Vashisht, Rohit
SIGNAL IMAGE AND VIDEO PROCESSING, 2025, 19 (02)
[9] Trajectory-aware three-stream CNN for video action recognition
Weng, Zhengkui
Guan, Yepeng
JOURNAL OF ELECTRONIC IMAGING, 2019, 28 (02)
[10] Sequential Deep Trajectory Descriptor for Action Recognition With Three-Stream CNN
Shi, Yemin
Tian, Yonghong
Wang, Yaowei
Huang, Tiejun
IEEE TRANSACTIONS ON MULTIMEDIA, 2017, 19 (07) : 1510 - 1520
