Discriminative Segment Focus Network for Fine-grained Video Action Recognition

Cited by: 0
Authors
Sun, Baoli [1]
Ye, Xinchen [2]
Yan, Tiantian [3]
Wang, Zhihui [2]
Li, Haojie [4]
Wang, Zhiyong [5]
Affiliations
[1] Dalian Univ Technol, Dalian, Liaoning, Peoples R China
[2] Dalian Univ Technol, DUT RU Int Sch Informat Sci & Engn, Dalian, Liaoning, Peoples R China
[3] Dalian Univ, Natl & Local Joint Engn Lab Comp Aided Design, Dalian, Liaoning, Peoples R China
[4] Shandong Univ Sci & Technol, Coll Comp Sci & Engn, Qingdao, Shandong, Peoples R China
[5] Univ Sydney, Sch Informat Technol, Sydney, NSW, Australia
Keywords
Fine-grained action recognition; discriminative segment; correlation
DOI
10.1145/3654671
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Fine-grained video action recognition aims to identify minor but discriminative variations among fine categories of actions. While many recent action recognition methods have been proposed to better model spatio-temporal representations, they have neglected how to model the interactions among discriminative atomic actions to effectively characterize inter-class and intra-class variations, which is vital for understanding fine-grained actions. In this work, we devise a Discriminative Segment Focus Network (DSFNet) to mine the discriminability of segment correlations and localize discriminative action-relevant segments for fine-grained video action recognition. Firstly, we propose a hierarchic correlation reasoning (HCR) module which explicitly establishes correlations between different segments at multiple temporal scales and enhances each segment by exploiting the correlations with other segments. Secondly, a discriminative segment focus (DSF) module is devised to localize the most action-relevant segments from the enhanced representations of HCR by enforcing the consistency between the discriminability and the classification confidence of a given segment with a consistency constraint. Finally, these localized segment representations are combined with the global action representation of the whole video to boost final recognition. Extensive experimental results on two fine-grained action recognition datasets, i.e., FineGym and Diving48, and two action recognition datasets, i.e., Kinetics400 and Something-Something, demonstrate the effectiveness of our approach compared with state-of-the-art methods.
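The three steps in the abstract (correlate segments at several temporal scales, keep the most relevant segments, fuse them with a global representation) can be illustrated with a small NumPy sketch. This is not the authors' DSFNet implementation. Every function name, the choice of scales, the softmax-normalized dot-product correlation, and the feature-norm score used to rank segments are assumptions made for illustration only, and the consistency constraint used during training is not modeled.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def pool_segments(frames, scale):
    """Average every `scale` consecutive frame features into one segment."""
    t, d = frames.shape
    n = t // scale
    return frames[: n * scale].reshape(n, scale, d).mean(axis=1)


def enhance(segs):
    """Add to each segment a correlation-weighted sum of all segments.

    Assumption: correlation is a scaled dot product passed through softmax.
    """
    d = segs.shape[1]
    corr = softmax(segs @ segs.T / np.sqrt(d), axis=-1)  # (n, n)
    return segs + corr @ segs


def focus(scores, k):
    """Return the indices of the k highest-scoring segments in temporal order."""
    top = np.argsort(scores)[::-1][:k]
    return np.sort(top)


def fuse(frames, scales=(1, 2, 4), k=2):
    """Concatenate a global video feature with the focused segment features."""
    global_repr = frames.mean(axis=0)
    picked = []
    for s in scales:
        segs = enhance(pool_segments(frames, s))
        # Placeholder score (assumption): the feature norm stands in for a
        # learned per-segment discriminability or classification confidence.
        scores = np.linalg.norm(segs, axis=1)
        idx = focus(scores, min(k, len(segs)))
        picked.append(segs[idx].mean(axis=0))
    local_repr = np.mean(picked, axis=0)
    return np.concatenate([global_repr, local_repr])


rng = np.random.default_rng(0)
frames = rng.standard_normal((8, 4))  # 8 frames, 4-dimensional features
video_repr = fuse(frames)
print(video_repr.shape)
```

In a trained model the placeholder score would be replaced by learned quantities, and the fused vector would feed a classifier; the sketch only shows how the data flows through the three stages.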
Pages: 20