Augmented two stream network for robust action recognition adaptive to various action videos

Cited by: 7
Authors
Leng, Chuanjiang [1 ]
Ding, Qichuan [1 ]
Wu, Chengdong [1 ]
Chen, Ange [1 ]
Affiliations
[1] Northeastern Univ, Fac Robot Sci & Engn, Shenyang 110169, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Two-stream network; Action recognition; Data skew;
DOI
10.1016/j.jvcir.2021.103344
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
In video-based action recognition, training a two-stream network on videos with different frame numbers can cause data skew problems. Moreover, extracting the key frames from a video is crucial for improving the training and recognition efficiency of action recognition systems. However, previous works suffer from information loss and optical-flow interference when handling videos with different frame numbers. In this paper, an augmented two-stream network (ATSNet) is proposed to achieve robust action recognition. A frame-number-unified strategy is first incorporated into the temporal stream network to unify the frame numbers of videos. Subsequently, the grayscale statistics of the optical-flow images are extracted to filter out invalid optical-flow images and to produce dynamic fusion weights for the two branch networks, adapting the model to different action videos. Experiments conducted on the UCF101 dataset demonstrate that ATSNet outperforms previously reported methods, improving the recognition accuracy by 1.13%.
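The pipeline described in the abstract can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: the function names, the grayscale standard-deviation threshold for "invalid" flow images, and the linear weighting rule are all assumptions made for demonstration.

```python
import numpy as np

def unify_frame_indices(n_frames, target=16):
    """Frame-number unification: sample a fixed number of frame indices
    uniformly from a video of arbitrary length (illustrative strategy)."""
    return np.linspace(0, n_frames - 1, target).round().astype(int)

def filter_flow_frames(flow_grays, std_thresh=2.0):
    """Keep optical-flow images whose grayscale standard deviation exceeds
    a motion threshold; near-constant images carry little motion signal."""
    return [g for g in flow_grays if np.std(g.astype(np.float32)) > std_thresh]

def temporal_fusion_weight(n_valid, n_total, base=0.5):
    """Scale the temporal branch's fusion weight by the fraction of valid
    flow frames, so flow-poor videos lean on the spatial branch."""
    if n_total == 0:
        return 0.0
    return base * (n_valid / n_total)

# Example: three nearly static flow images and one with visible motion.
static = np.full((4, 4), 128, dtype=np.uint8)
moving = np.array([[0, 255, 0, 255]] * 4, dtype=np.uint8)
frames = [static, static, moving, static]

idx = unify_frame_indices(100, target=16)          # 16 sampled indices
valid = filter_flow_frames(frames)                 # only `moving` survives
w_temporal = temporal_fusion_weight(len(valid), len(frames))
w_spatial = 1.0 - w_temporal
```

Under these assumptions, the three static flow images are discarded, and the temporal branch's weight shrinks in proportion to the number of frames that were filtered out.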
Pages: 8
Related Papers
50 records in total
  • [41] CTC Network with Statistical Language Modeling for Action Sequence Recognition in Videos
    Lin, Mengxi
    Inoue, Nakamasa
    Shinoda, Koichi
    PROCEEDINGS OF THE THEMATIC WORKSHOPS OF ACM MULTIMEDIA 2017 (THEMATIC WORKSHOPS'17), 2017, : 393 - 401
  • [42] Deep ChaosNet for Action Recognition in Videos
    Chen, Huafeng
    Zhang, Maosheng
    Gao, Zhengming
    Zhao, Yunhong
    COMPLEXITY, 2021, 2021
  • [43] Recurrent Spatial-Temporal Attention Network for Action Recognition in Videos
    Du, Wenbin
    Wang, Yali
    Qiao, Yu
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (03) : 1347 - 1360
  • [44] FEATURE SPACE DATA AUGMENTATION FOR VIEWPOINT-ROBUST ACTION RECOGNITION IN VIDEOS
    Geara, Carla
    Setkov, Aleksandr
    Orcesi, Astrid
    Luvison, Bertrand
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 585 - 589
  • [45] ACTION RECOGNITION IN UNCONSTRAINED AMATEUR VIDEOS
    Liu, Jingen
    Luo, Jiebo
    Shah, Mubarak
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 3549 - +
  • [46] Structured Learning for Action Recognition in Videos
    Long, Yinghan
    Srinivasan, Gopalakrishnan
    Panda, Priyadarshini
    Roy, Kaushik
    IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2019, 9 (03) : 475 - 484
  • [47] Group Action Recognition in Soccer Videos
    Kong, Yu
    Zhan, Xiaoqin
    Wei, Qingdi
    Hu, Weiming
    Jia, Yunde
    19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, 2008, : 249 - +
  • [48] Accelerated action recognition and segmentation in videos
    Ghodhbani, Emna
    Mefteh, Ahmed
    Benazza-Benyahia, Amel
2020 10TH INTERNATIONAL SYMPOSIUM ON SIGNAL, IMAGE, VIDEO AND COMMUNICATIONS (ISIVC), 2021
  • [49] Tifar-net: three-stream inception former-based action recognition network for infrared videos
    Imran, Javed
    Rajput, Amitesh Singh
    Vashisht, Rohit
    SIGNAL IMAGE AND VIDEO PROCESSING, 2025, 19 (02)
  • [50] An Improved Attention-Based Spatiotemporal-Stream Model for Action Recognition in Videos
    Liu, Dan
    Ji, Yunfeng
    Ye, Mao
    Gan, Yan
    Zhang, Jianwei
    IEEE ACCESS, 2020, 8 : 61462 - 61470