Deep multiple aggregation networks for action recognition

Cited by: 0
Authors
Ahmed Mazari
Hichem Sahbi
Affiliations
[1] Sorbonne University, CNRS, LIP6
Keywords
Multiple aggregation design; 2-Stream networks; Action recognition
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Most current action recognition algorithms are based on deep networks that stack multiple convolutional, pooling and fully connected layers. While convolutional and fully connected operations have been widely studied in the literature, the design of pooling operations that handle action recognition, where action categories exhibit different temporal granularities, has received comparatively less attention, and existing solutions rely mainly on max or average pooling. These operations cannot fully capture the actual temporal granularity of action categories and thereby constitute a bottleneck in classification performance. In this paper, we introduce a novel hierarchical pooling design that captures different levels of temporal granularity in action recognition. Our design principle is coarse-to-fine and is achieved using a tree-structured network; as we traverse this network top-down, pooling operations become less invariant but temporally more resolute and better localized. Learning the combination of operations in this network that best fits a given ground truth is achieved by solving a constrained minimization problem whose solution corresponds to the distribution of weights that capture the contribution of each level (and thereby each temporal granularity) in the global hierarchical pooling process. Besides being principled and well grounded, the proposed hierarchical pooling is also video-length and resolution agnostic. Extensive experiments conducted on the challenging UCF-101, HMDB-51 and JHMDB-21 databases corroborate all these statements.
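The coarse-to-fine idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the binary split per level, the use of average pooling inside each node, and the names `simplex_weights` and `hierarchical_temporal_pooling` are all assumptions made for illustration. The abstract only states that level weights form a distribution obtained from a constrained minimization problem; here a softmax is swapped in as one simple way to keep weights non-negative and summing to one.

```python
import numpy as np


def simplex_weights(scores):
    """Map unconstrained scores to a distribution (softmax).

    Assumption: a stand-in for the constrained minimization mentioned
    in the abstract, whose exact form is not given there.
    """
    scores = np.asarray(scores, dtype=float)
    exp = np.exp(scores - scores.max())  # shift for numerical stability
    return exp / exp.sum()


def hierarchical_temporal_pooling(frames, level_weights):
    """Pool a (T, D) array of per-frame features over a temporal tree.

    Level l splits the video into 2**l contiguous segments (hypothetical
    choice); each segment is average-pooled and scaled by the weight of
    its level. The output has D * (2**depth - 1) entries for any T.
    """
    frames = np.asarray(frames, dtype=float)
    weights = np.asarray(level_weights, dtype=float)
    dim = frames.shape[1]
    parts = []
    for level, weight in enumerate(weights):
        # array_split tolerates lengths that are not divisible by 2**level
        for segment in np.array_split(frames, 2 ** level, axis=0):
            if len(segment) > 0:
                pooled = segment.mean(axis=0)
            else:
                pooled = np.zeros(dim)  # empty node when T < 2**level
            parts.append(weight * pooled)
    return np.concatenate(parts)


# Usage: two videos of different lengths yield descriptors of equal size.
weights = simplex_weights([0.0, 0.0, 0.0])  # three levels, uniform weights
short_video = np.random.rand(5, 4)
long_video = np.random.rand(37, 4)
descriptor_short = hierarchical_temporal_pooling(short_video, weights)
descriptor_long = hierarchical_temporal_pooling(long_video, weights)
```

Because the number of tree nodes depends only on the depth, the descriptor size is independent of the number of frames, which is one way to read the "video-length agnostic" claim. Coarse levels (few segments) are more invariant to timing, while deeper levels localize features in time.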
Related papers
  • [1] Deep multiple aggregation networks for action recognition
    Mazari, Ahmed
    Sahbi, Hichem
    [J]. INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2024, 13 (01)
  • [2] Cooperative Training of Deep Aggregation Networks for RGB-D Action Recognition
    Wang, Pichao
    Li, Wanqing
    Wan, Jun
    Ogunbona, Philip
    Liu, Xinwang
    [J]. THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 7404 - 7411
  • [3] Action Recognition with Deep Neural Networks
    Doyran, Metehan
    Yildirim, Yigit
    Salah, Albert Ali
    [J]. 2017 25TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2017,
  • [4] Combining multiple deep cues for action recognition
    Ruiqi Wang
    Xinxiao Wu
    [J]. Multimedia Tools and Applications, 2019, 78 : 9933 - 9950
  • [5] Combining multiple deep cues for action recognition
    Wang, Ruiqi
    Wu, Xinxiao
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (08) : 9933 - 9950
  • [6] Human Action Recognition Using Deep Neural Networks
    Koli, Rashmi R.
    Bagban, Tanveer I.
    [J]. PROCEEDINGS OF THE 2020 FOURTH WORLD CONFERENCE ON SMART TRENDS IN SYSTEMS, SECURITY AND SUSTAINABILITY (WORLDS4 2020), 2020, : 376 - 380
  • [7] Micro-Expression Recognition Based on Multiple Aggregation Networks
    She, Wenxiang
    Lv, Zhao
    Tao, Jianhua
    Liu, Bin
    Niu, Mingyue
    [J]. 2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020, : 1043 - 1047
  • [8] Action Recognition with Fusion of Multiple Graph Convolutional Networks
    Maurice, Camille
    Lerasle, Frederic
    [J]. 2021 17TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE (AVSS 2021), 2021,
  • [9] Modality Distillation with Multiple Stream Networks for Action Recognition
    Garcia, Nuno C.
    Morerio, Pietro
    Murino, Vittorio
    [J]. COMPUTER VISION - ECCV 2018, PT VIII, 2018, 11212 : 106 - 121
  • [10] Procedural Generation of Videos to Train Deep Action Recognition Networks
    Roberto de Souza, Cesar
    Gaidon, Adrien
    Cabon, Yohann
    Manuel Lopez, Antonio
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 2594 - 2604