Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition

Cited by: 14
Authors
Li, Tianjiao [1]
Foo, Lin Geng [1]
Ke, Qiuhong [2]
Rahmani, Hossein [3]
Wang, Anran [4]
Wang, Jinghua [5]
Liu, Jun [1]
Affiliations
[1] Singapore Univ Technol & Design, ISTD Pillar, Singapore, Singapore
[2] Monash Univ, Dept Data Sci & AI, Melbourne, Vic, Australia
[3] Univ Lancaster, Sch Comp & Commun, Lancaster, England
[4] ByteDance, Beijing, Peoples R China
[5] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin, Peoples R China
Source
Funding
National Research Foundation, Singapore;
Keywords
Action recognition; Fine-grained; Dynamic neural networks; HUMAN NEURAL SYSTEM; FACE; REPRESENTATIONS; IDENTITY;
DOI
10.1007/978-3-031-19772-7_23
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The goal of fine-grained action recognition is to successfully discriminate between action categories with subtle differences. To tackle this, we derive inspiration from the human visual system which contains specialized regions in the brain that are dedicated towards handling specific tasks. We design a novel Dynamic Spatio-Temporal Specialization (DSTS) module, which consists of specialized neurons that are only activated for a subset of samples that are highly similar. During training, the loss forces the specialized neurons to learn discriminative fine-grained differences to distinguish between these similar samples, improving fine-grained recognition. Moreover, a spatio-temporal specialization method further optimizes the architectures of the specialized neurons to capture either more spatial or temporal fine-grained information, to better tackle the large range of spatio-temporal variations in the videos. Lastly, we design an Upstream-Downstream Learning algorithm to optimize our model's dynamic decisions during training, improving the performance of our DSTS module. We obtain state-of-the-art performance on two widely-used fine-grained action recognition datasets.
Pages: 386-403
Number of pages: 18