Learning Bag of Spatio-Temporal Features for Human Interaction Recognition

Cited by: 3
Authors
Slimani, Khadidja Nour El Houda [1]
Benezeth, Yannick [2]
Souami, Feryel [1]
Affiliations
[1] Univ Sci & Technol Houari Boumediene, LRIA, BP 32 El Alia, Algiers 16111, Algeria
[2] Univ Burgundy Franche Comte, ImViA EA 7535, F-21000 Dijon, France
Source
TWELFTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2019) | 2020 / Vol. 11433
Keywords
Human interaction; Edge-based region; MSER; Bag of Visual Words; 3D-SIFT; Sum of Histograms; SVM; VIDEOS
DOI
10.1117/12.2559268
Chinese Library Classification number
O43 [Optics]
Discipline classification code
070207; 0803
Abstract
The Bag of Visual Words (BoVW) model has achieved impressive performance on human activity recognition. However, because it ignores the spatiotemporal distribution of visual words, it struggles to capture the high-level semantics behind video features and cannot localize interactions within a video. In this paper, we propose a supervised learning framework that automatically recognizes high-level human interactions based on a bag of spatiotemporal visual features. First, a representative baseline keyframe that captures the major body parts of the interacting persons is selected, and the bounding boxes containing the persons are extracted to parse the poses of all persons in the interaction. Based on this keyframe, features are detected for each interacting person by combining edge features and Maximally Stable Extremal Regions (MSER) features, and are tracked backward and forward over the entire video sequence. From the feature tracks, 3D XYT spatiotemporal volumes are generated for each interacting target. Then, the K-means algorithm is used to build a codebook of visual features to represent a given interaction. The interaction is then represented by the sum of the occurrence frequencies of visual words between persons. Extensive experimental evaluations on the UT-Interaction dataset demonstrate the strength of our method in recognizing ongoing interactions from videos with a simple implementation.
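The codebook and sum-of-histograms steps described in the abstract can be illustrated with a minimal sketch. It assumes each interacting person has already been reduced to a list of fixed-length descriptor vectors (the paper's keywords suggest 3D-SIFT, but the toy 2-D vectors here are placeholders), and it reads "sum of the frequency occurrence of visual words between persons" as an element-wise sum of the two per-person histograms. The function names `kmeans`, `bovw_histogram`, and `interaction_descriptor` are hypothetical and not taken from the paper; the plain Lloyd's K-means below stands in for whatever implementation the authors used.

```python
import random
from typing import List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers: Sequence[Vector], v: Vector) -> int:
    """Index of the visual word (cluster center) closest to descriptor v."""
    return min(range(len(centers)), key=lambda i: sq_dist(centers[i], v))


def kmeans(descriptors: Sequence[Vector], k: int,
           iters: int = 20, seed: int = 0) -> List[List[float]]:
    """Plain Lloyd's K-means; returns k cluster centers (the codebook)."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(list(descriptors), k)]
    for _ in range(iters):
        clusters: List[List[Vector]] = [[] for _ in range(k)]
        for v in descriptors:
            clusters[nearest(centers, v)].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def bovw_histogram(descriptors: Sequence[Vector],
                   centers: Sequence[Vector]) -> List[int]:
    """Count how often each visual word occurs among the descriptors."""
    hist = [0] * len(centers)
    for v in descriptors:
        hist[nearest(centers, v)] += 1
    return hist


def interaction_descriptor(person_a: Sequence[Vector],
                           person_b: Sequence[Vector],
                           centers: Sequence[Vector]) -> List[int]:
    """Assumed reading of 'sum of histograms': add the two persons' histograms."""
    hist_a = bovw_histogram(person_a, centers)
    hist_b = bovw_histogram(person_b, centers)
    return [x + y for x, y in zip(hist_a, hist_b)]


if __name__ == "__main__":
    # Toy placeholder descriptors for two interacting persons.
    person_a = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1]]
    person_b = [[5.2, 4.9], [4.8, 5.0], [0.1, 0.1]]
    codebook = kmeans(person_a + person_b, k=2)
    print(interaction_descriptor(person_a, person_b, codebook))
```

In a full pipeline the resulting vector would typically be normalized and passed to a classifier such as the SVM listed in the keywords; that step is omitted here.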
Pages: 8