Discovering Motion Primitives for Unsupervised Grouping and One-Shot Learning of Human Actions, Gestures, and Expressions

Cited by: 86

Authors:

Yang, Yang [1]

Saleemi, Imran [1]

Shah, Mubarak [1]

Affiliations:

[1] Univ Cent Florida, Dept Elect Engn & Comp Sci EECS, Comp Vis Lab, Orlando, FL 32816 USA

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2013 / Vol. 35 / No. 07

Keywords:

Human actions; one-shot learning; unsupervised clustering; gestures; facial expressions; action representation; action recognition; motion primitives; motion patterns; histogram of motion primitives; motion primitives strings; Hidden Markov model; RECOGNITION;

DOI:

10.1109/TPAMI.2012.253

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper proposes a novel representation of articulated human actions and gestures and facial expressions. The main goals of the proposed approach are: 1) to enable recognition using very few examples, i.e., one or k-shot learning, and 2) meaningful organization of unlabeled datasets by unsupervised clustering. Our proposed representation is obtained by automatically discovering high-level subactions or motion primitives, by hierarchical clustering of observed optical flow in four-dimensional, spatial, and motion flow space. The completely unsupervised proposed method, in contrast to state-of-the-art representations like bag of video words, provides a meaningful representation conducive to visual interpretation and textual labeling. Each primitive action depicts an atomic subaction, like directional motion of limb or torso, and is represented by a mixture of four-dimensional Gaussian distributions. For one-shot and k-shot learning, the sequence of primitive labels discovered in a test video are labeled using KL divergence, and can then be represented as a string and matched against similar strings of training videos. The same sequence can also be collapsed into a histogram of primitives or be used to learn a Hidden Markov model to represent classes. We have performed extensive experiments on recognition by one and k-shot learning as well as unsupervised action clustering on six human actions and gesture datasets, a composite dataset, and a database of facial expressions. These experiments confirm the validity and discriminative nature of the proposed representation.
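The matching stage described in the abstract can be illustrated with a minimal sketch. It assumes each video has already been reduced to a sequence of integer primitive labels; the discovery of primitives and the KL-divergence labeling step are not shown. The function names, the example labels, and the use of Levenshtein edit distance as the string-matching measure are assumptions made for illustration, since the abstract does not specify the matching function.

```python
from collections import Counter


def primitive_histogram(labels, num_primitives):
    """Collapse a sequence of primitive labels into a normalized histogram."""
    if not labels:
        return [0.0] * num_primitives
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(p, 0) / total for p in range(num_primitives)]


def edit_distance(a, b):
    """Levenshtein distance between two label sequences (assumed matcher)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def classify_one_shot(test_labels, training_examples):
    """Return the class whose single training string is closest to the test string.

    training_examples maps a class name to one primitive-label sequence,
    i.e. one example per class (hypothetical data layout).
    """
    return min(training_examples,
               key=lambda name: edit_distance(test_labels, training_examples[name]))
```

With one training string per class, `classify_one_shot([0, 1, 1, 2], {"wave": [0, 1, 2], "kick": [3, 4, 3]})` selects the class with the smallest edit distance; the histogram from `primitive_histogram` could feed any standard histogram comparison instead.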


Pages: 1635-1648

Page count: 14

