Augmented two stream network for robust action recognition adaptive to various action videos

Cited by: 7

Authors:

Leng, Chuanjiang [1]

Ding, Qichuan [1]

Wu, Chengdong [1]

Chen, Ange [1]

Affiliation:

[1] Northeastern Univ, Fac Robot Sci & Engn, Shenyang 110169, Peoples R China

Source:

JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION | 2021 / Vol. 81

Funding:

National Natural Science Foundation of China;

Keywords:

Two-stream network; Action recognition; Data skew;

DOI:

10.1016/j.jvcir.2021.103344

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812;

Abstract:

In video-based action recognition, using videos with different frame numbers to train a two-stream network can result in data skew problems. Moreover, extracting the key frames from a video is crucial for improving the training and recognition efficiency of action recognition systems. However, previous works suffer from problems of information loss and optical-flow interference when handling videos with different frame numbers. In this paper, an augmented two-stream network (ATSNet) is proposed to achieve robust action recognition. A frame-number-unified strategy is first incorporated into the temporal stream network to unify the frame numbers of videos. Subsequently, the grayscale statistics of the optical-flow images are extracted to filter out any invalid optical-flow images and produce the dynamic fusion weights for the two branch networks to adapt to different action videos. Experiments conducted on the UCF101 dataset demonstrate that ATSNet outperforms previously defined methods, improving the recognition accuracy by 1.13%.
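The abstract names three steps (unifying frame counts, filtering invalid optical-flow images by a grayscale statistic, and fusing the two streams with dynamic weights) without giving their formulas. The sketch below is only an illustration of how such a pipeline might look, not the authors' ATSNet implementation. Everything in it is an assumption: the function names, evenly spaced resampling as the unification rule, mean absolute deviation from a neutral gray level of 128 as the statistic, and the `min_activity` and `scale` parameters.

```python
import numpy as np

def unify_frame_count(frames, target):
    """Resample a clip to exactly `target` frames with evenly spaced indices.
    Assumed rule: short clips repeat frames, long clips skip frames."""
    idx = np.linspace(0, len(frames) - 1, num=target).round().astype(int)
    return [frames[i] for i in idx]

def flow_activity(flow_img, neutral=128.0):
    """Hypothetical grayscale statistic: mean absolute deviation from the gray
    level assumed to encode zero motion in an 8-bit optical-flow image."""
    return float(np.mean(np.abs(flow_img.astype(np.float32) - neutral)))

def filter_flow(flow_imgs, min_activity=1.0):
    """Drop flow images whose statistic suggests almost no motion."""
    return [f for f in flow_imgs if flow_activity(f) >= min_activity]

def fuse_scores(spatial, temporal, flow_imgs, scale=10.0):
    """Weight the temporal stream more heavily when the retained flow images
    show more motion; fall back to the spatial stream when none remain."""
    if not flow_imgs:
        return spatial
    activity = np.mean([flow_activity(f) for f in flow_imgs])
    w = activity / (activity + scale)  # assumed mapping, lies in [0, 1)
    return (1.0 - w) * spatial + w * temporal

# Usage with synthetic data
static = np.full((4, 4), 128, dtype=np.uint8)   # no motion
moving = np.full((4, 4), 160, dtype=np.uint8)   # uniform motion
kept = filter_flow([static, moving])
fused = fuse_scores(np.array([0.8, 0.2]), np.array([0.2, 0.8]), kept)
```

With these assumptions, a clip dominated by static flow images relies on the spatial stream, while a clip with strong motion shifts the decision toward the temporal stream.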

Pages: 8
