Three-Stream Action Tubelet Detector for Spatiotemporal Action Detection in Videos

Cited by: 1

Authors:

Wu, Yutang [1,2]

Wang, Hanli [1,2]

Li, Qinyu [1,3]

Affiliations:

[1] Tongji Univ, Dept Comp Sci & Technol, Shanghai 201804, Peoples R China

[2] Tongji Univ, Key Lab Embedded Syst & Serv Comp, Minist Educ, Shanghai 200092, Peoples R China

[3] Lanzhou City Univ, Dept Comp Sci, Lanzhou 730070, Gansu, Peoples R China

Source:

ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2018, PT II | 2018 / Vol. 11165

Funding:

National Natural Science Foundation of China;

Keywords:

Human action detection; Three-stream architecture; Action tubelet detector; Pose stream;

DOI:

10.1007/978-3-030-00767-6_28

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

In recent years, human action detection in videos has gained wide attention. Instead of detecting frame by frame, a model named the action tubelet (ACT) detector detects human actions sequence by sequence and, in its two-stream form, achieves remarkable performance in both accuracy and speed. In this work, a three-stream action tubelet detector (three-stream ACT detector) is proposed. Compared with the two-stream architecture, it adds an extra pose stream to obtain more information about human actions and fuses the three streams by weighted average. The experimental results on the benchmark UCF-Sports, J-HMDB and UCF-101 datasets demonstrate that the proposed three-stream ACT detector framework is able to boost the performance of human action detection.
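The weighted-average fusion mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the function name `fuse_streams`, the stream names (`appearance`, `motion`, `pose`), the weight values, and the idea that each stream yields one score per action class for the same tubelet. The record does not give the authors' actual weights or implementation.

```python
def fuse_streams(scores, weights):
    """Weighted average of per-stream class scores (hypothetical helper).

    scores:  dict mapping stream name -> list of class scores (equal lengths)
    weights: dict mapping stream name -> non-negative weight
    """
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_classes = len(next(iter(scores.values())))
    fused = [0.0] * n_classes
    for name, stream_scores in scores.items():
        if len(stream_scores) != n_classes:
            raise ValueError("all streams must score the same classes")
        w = weights[name] / total  # normalise so the weights sum to 1
        for i, s in enumerate(stream_scores):
            fused[i] += w * s
    return fused


# Illustrative values only; they are not taken from the paper.
scores = {
    "appearance": [0.6, 0.4],
    "motion": [0.2, 0.8],
    "pose": [0.5, 0.5],
}
weights = {"appearance": 1.0, "motion": 2.0, "pose": 1.0}
fused = fuse_streams(scores, weights)
```

With equal weights this reduces to a plain mean of the three streams; unequal weights let one stream dominate the fused score.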

Pages: 296 - 306

Page count: 11

50 records in total

[41] Anomaly Detection for Spatiotemporal Data in Action
Yang, Guang
Kulkarni, Ninad
Dua, Paavani
Khullar, Dipika
Chirayath, Alex Anto
PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022 : 4844 - 4845
[42] Two-Stream Convolutional Networks for Action Recognition in Videos
Simonyan, Karen
Zisserman, Andrew
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, 27
[43] 3s-STNet: three-stream spatial-temporal network with appearance and skeleton information learning for action recognition
Fang, Ming
Peng, Siyu
Zhao, Yang
Yuan, Haibo
Hung, Chih-Cheng
Liu, Shuhua
NEURAL COMPUTING & APPLICATIONS, 2023, 35 (02) : 1835 - 1848
[44] COWO: towards real-time spatiotemporal action localization in videos
Yi, Yang
Sun, Yang
Yuan, Saimei
Zhu, Yiji
Zhang, Mengyi
Zhu, Wenjun
ASSEMBLY AUTOMATION, 2022, 42 (02) : 202 - 208
[45] A Spatiotemporal Heterogeneous Two-Stream Network for Action Recognition
Chen, Enqing
Bai, Xue
Gao, Lei
Tinega, Haron Chweya
Ding, Yingqiang
IEEE ACCESS, 2019, 7 : 57267 - 57275
[46] Two-stream spatiotemporal networks for skeleton action recognition
Wang, Lei
Zhang, Jianwei
Yang, Shanmin
Gu, Song
IET IMAGE PROCESSING, 2023, 17 (11) : 3358 - 3370
[47] Three-stream interaction decoder network for RGB-thermal salient object detection
Huo, Fushuo
Zhu, Xuegui
Li, Bingheng
KNOWLEDGE-BASED SYSTEMS, 2022, 258
[48] Three-stream network with context convolution module for human-object interaction detection
Siadari, Thomhert S.
Han, Mikyong
Yoon, Hyunjin
ETRI JOURNAL, 2020, 42 (02) : 230 - 238
[49] Spatiotemporal Deformable Part Models for Action Detection
Tian, Yicong
Sukthankar, Rahul
Shah, Mubarak
2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2013 : 2642 - 2649
[50] Online human action detection and anticipation in videos: A survey
Hu, Xuejiao
Dai, Jingzhao
Li, Ming
Peng, Chenglei
Li, Yang
Du, Sidan
Neurocomputing, 2022, 491 : 395 - 413