Unsupervised Hierarchical Dynamic Parsing and Encoding for Action Recognition

Cited by: 16
Authors
Su, Bing [1]
Zhou, Jiahuan [2]
Ding, Xiaoqing [3]
Wu, Ying [2]
Affiliations
[1] Chinese Acad Sci, Inst Software, Sci & Technol Integrated Informat Syst Lab, Beijing 100190, Peoples R China
[2] Northwestern Univ, Dept Elect Engn & Comp Sci, Evanston, IL 60208 USA
[3] Tsinghua Univ, Dept Elect Engn, Tsinghua Natl Lab Informat Sci & Technol, State Key Lab Intelligent Technol & Syst, Beijing 100084, Peoples R China
Funding
National Science Foundation (USA); National Natural Science Foundation of China
Keywords
Action recognition; temporal clustering; hierarchical modeling; dynamic encoding; ENSEMBLE; VECTOR; MODELS; PARTS;
DOI
10.1109/TIP.2017.2745212
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Generally, the evolution of an action is not uniform across the video, but exhibits quite complex rhythms and non-stationary dynamics. To model such non-uniform temporal dynamics, in this paper, we describe a novel hierarchical dynamic parsing and encoding method to capture both the locally smooth dynamics and globally drastic dynamic changes. It parses the dynamics of an action into different layers and encodes such multi-layer temporal information into a joint representation for action recognition. At the first layer, the action sequence is parsed in an unsupervised manner into several smooth-changing stages corresponding to different key poses or temporal structures by temporal clustering. The dynamics within each stage are encoded by mean-pooling or rank-pooling. At the second layer, the temporal information of the ordered dynamics extracted from the previous layer is encoded again by rank-pooling to form the overall representation. Extensive experiments on a gesture action data set (Chalearn Gesture) and three generic action data sets (Olympic Sports, Hollywood2, and UCF101) have demonstrated the effectiveness of the proposed method.
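The two-layer scheme summarized in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the function names are hypothetical, the uniform contiguous split in `split_into_stages` is only a placeholder for the paper's unsupervised temporal clustering, and `rank_pool` is a simplified ridge-regression approximation of rank-pooling (it regresses the time index on running-mean features) and not a ranking-machine solver.

```python
import numpy as np


def mean_pool(frames):
    """Encode a stage by averaging its frame-wise feature vectors."""
    return frames.mean(axis=0)


def rank_pool(frames, ridge=1e-3):
    """Simplified rank-pooling (assumption: ridge regression stands in
    for the ranking objective). Returns weights w such that
    w @ running_mean[t] roughly increases with the time index t."""
    n_steps, dim = frames.shape
    steps = np.arange(1, n_steps + 1, dtype=float)
    # Time-varying mean vectors smooth the sequence before fitting.
    running_mean = np.cumsum(frames, axis=0) / steps[:, None]
    gram = running_mean.T @ running_mean + ridge * np.eye(dim)
    return np.linalg.solve(gram, running_mean.T @ steps)


def split_into_stages(frames, n_stages):
    """Placeholder for temporal clustering: a uniform contiguous split.
    The paper parses stages in an unsupervised, data-driven way."""
    n_stages = max(1, min(n_stages, len(frames)))
    return np.array_split(frames, n_stages, axis=0)


def hierarchical_encode(frames, n_stages=4):
    """Layer 1: pool each stage. Layer 2: rank-pool the ordered stage codes."""
    stages = split_into_stages(frames, n_stages)
    stage_codes = np.stack([mean_pool(stage) for stage in stages])
    return rank_pool(stage_codes)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.normal(size=(40, 8))  # 40 frames, 8-dim features (synthetic)
    code = hierarchical_encode(video, n_stages=4)
    print(code.shape)
```

The output vector has the same dimensionality as the frame features, so it could be fed to any standard classifier. Swapping `mean_pool` for `rank_pool` in the first layer gives the other variant mentioned in the abstract.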
Pages: 5784-5799
Page count: 16