Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

Cited by: 1909
Authors
Wang, Limin [1]
Xiong, Yuanjun [2]
Wang, Zhe [3]
Qiao, Yu [3]
Lin, Dahua [2]
Tang, Xiaoou [2]
Van Gool, Luc [1]
Affiliations
[1] ETH, Comp Vis Lab, Zurich, Switzerland
[2] Chinese Univ Hong Kong, Dept Informat Engn, Hong Kong, Hong Kong, Peoples R China
[3] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China
Source
Keywords
Action recognition; Temporal segment networks; Good practices; ConvNets
DOI
10.1007/978-3-319-46484-8_2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep convolutional networks have achieved great success for visual recognition in still images. However, for action recognition in videos, their advantage over traditional methods is not so evident. This paper aims to discover the principles for designing effective ConvNet architectures for action recognition in videos and for learning these models given limited training samples. Our first contribution is the temporal segment network (TSN), a novel framework for video-based action recognition, which is based on the idea of long-range temporal structure modeling. It combines a sparse temporal sampling strategy and video-level supervision to enable efficient and effective learning using the whole action video. The other contribution is our study of a series of good practices for learning ConvNets on video data with the help of the temporal segment network. Our approach obtains state-of-the-art performance on the HMDB51 (69.4%) and UCF101 (94.2%) datasets. We also visualize the learned ConvNet models, which qualitatively demonstrates the effectiveness of the temporal segment network and the proposed good practices.
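The sparse temporal sampling and video-level supervision mentioned in the abstract can be illustrated with a minimal sketch. It assumes the video is split into equal-duration segments with one frame drawn at random from each, and that per-segment class scores are averaged into a single video-level prediction. The averaging step and the function names `sample_segment_indices` and `average_consensus` are illustrative assumptions, not details given in this record.

```python
import random


def sample_segment_indices(num_frames, num_segments, rng=None):
    """Pick one random frame index from each of num_segments equal-duration segments.

    Hypothetical helper: the equal split and uniform draw are assumptions.
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    rng = rng or random.Random()
    indices = []
    for k in range(num_segments):
        start = k * num_frames // num_segments        # first frame of segment k
        end = (k + 1) * num_frames // num_segments    # one past the last frame
        indices.append(rng.randrange(start, end))
    return indices


def average_consensus(segment_scores):
    """Average per-segment class scores into one video-level score vector.

    Averaging is one simple aggregation choice, assumed here for illustration.
    """
    num_segments = len(segment_scores)
    num_classes = len(segment_scores[0])
    return [
        sum(scores[c] for scores in segment_scores) / num_segments
        for c in range(num_classes)
    ]
```

Because only a handful of frames per video are processed, the cost per training step does not grow with video length, while the loss is still computed on a prediction that covers the whole video.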
Pages: 20-36
Number of pages: 17