Exploring probabilistic localized video representation for human action recognition

Cited by: 0
Authors
Yan Song
Sheng Tang
Yan-Tao Zheng
Tat-Seng Chua
Yongdong Zhang
Shouxun Lin
Affiliations
[1] Chinese Academy of Sciences, Laboratory of Advanced Computing Research, Institute of Computing Technology
[2] Graduate University of the Chinese Academy of Sciences, School of Computing
[3] Institute for Infocomm Research
[4] A*STAR
[5] National University of Singapore
Source
Keywords
Human action recognition; Probabilistic video representation; Information-theoretic video matching
DOI
Not available
Chinese Library Classification (CLC) number
Subject classification number
Abstract
In recent years, bag-of-words (BoW) video representations have achieved promising results in human action recognition in videos. By vector-quantizing local spatio-temporal (ST) features, the BoW video representation brings simplicity and efficiency, but also limitations. First, the discretization of the feature space in BoW inevitably results in ambiguity and information loss in the video representation. Second, there is no universal codebook for the BoW representation: the codebook needs to be rebuilt whenever the video corpus changes. To tackle these issues, this paper explores a localized, continuous and probabilistic video representation. Specifically, the proposed representation encodes the visual and motion information of an ensemble of local ST features of a video into a distribution estimated by a generative probabilistic model. Furthermore, the probabilistic video representation naturally gives rise to an information-theoretic distance metric between videos. This makes the representation readily applicable to most discriminative classifiers, such as nearest-neighbor schemes and kernel-based classifiers. Experiments on two datasets, KTH and UCF Sports, show that the proposed approach can deliver promising results.
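Illustrative note: the pipeline outlined in the abstract (one distribution per video, an information-theoretic distance between distributions, then a nearest-neighbor decision) can be sketched as follows. The abstract names neither the generative model nor the divergence, so this sketch assumes a single full-covariance Gaussian per video and a symmetrised Kullback-Leibler divergence; the function names, the regularisation constant, and the synthetic "walk"/"run" data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_gaussian(features, reg=1e-3):
    """Fit one full-covariance Gaussian to a video's local feature vectors.

    `features` has shape (n_features_in_video, dim). The Gaussian is an
    assumption of this sketch, not the model used in the paper.
    """
    mean = features.mean(axis=0)
    # A small ridge keeps the covariance invertible.
    cov = np.cov(features, rowvar=False) + reg * np.eye(features.shape[1])
    return mean, cov

def kl_gaussian(p, q):
    """Closed-form KL(p || q) between two multivariate Gaussians."""
    (mu_p, cov_p), (mu_q, cov_q) = p, q
    dim = mu_p.shape[0]
    cov_q_inv = np.linalg.inv(cov_q)
    diff = mu_q - mu_p
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (np.trace(cov_q_inv @ cov_p)
                  + diff @ cov_q_inv @ diff
                  - dim + logdet_q - logdet_p)

def video_distance(p, q):
    """Symmetrised KL divergence, used here as the video-to-video distance."""
    return kl_gaussian(p, q) + kl_gaussian(q, p)

def nearest_neighbor_label(query, train_models, train_labels):
    """Return the label of the training video closest to the query."""
    dists = [video_distance(query, m) for m in train_models]
    return train_labels[int(np.argmin(dists))]

# Synthetic stand-ins for local ST descriptors (4-D, 200 per video).
rng = np.random.default_rng(0)
walk_video = rng.normal(loc=0.0, scale=1.0, size=(200, 4))
run_video = rng.normal(loc=5.0, scale=1.0, size=(200, 4))
query_video = rng.normal(loc=5.0, scale=1.0, size=(200, 4))

train_models = [fit_gaussian(walk_video), fit_gaussian(run_video)]
train_labels = ["walk", "run"]
predicted = nearest_neighbor_label(fit_gaussian(query_video),
                                   train_models, train_labels)
print(predicted)
```

Because the distance is computed directly between per-video distributions, no shared codebook is needed in this sketch, and adding a video only requires fitting one more model. The same distance could in principle be turned into a kernel, for example via exp(-gamma * distance), although that kernel is not guaranteed to be positive semi-definite.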
Pages: 663 - 685
Page count: 22
Related papers
50 in total
  • [1] Exploring probabilistic localized video representation for human action recognition
    Song, Yan
    Tang, Sheng
    Zheng, Yan-Tao
    Chua, Tat-Seng
    Zhang, Yongdong
    Lin, Shouxun
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2012, 58 (03) : 663 - 685
  • [2] Exploring Multimodal Video Representation for Action Recognition
    Wang, Cheng
    Yang, Haojin
    Meinel, Christoph
    [J]. 2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 1924 - 1931
  • [3] Localized Temporal Representation in Human Action Recognition
    Han, Pang Ying
    Yee, Khor Ean
    Yin, Ooi Shih
    [J]. PROCEEDINGS OF 2018 VII INTERNATIONAL CONFERENCE ON NETWORK, COMMUNICATION AND COMPUTING (ICNCC 2018), 2018, : 261 - 266
  • [4] A DISTRIBUTION BASED VIDEO REPRESENTATION FOR HUMAN ACTION RECOGNITION
    Song, Yan
    Tang, Sheng
    Zheng, Yan-Tao
    Chua, Tat-Seng
    Zhang, Yongdong
    Lin, Shouxun
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME 2010), 2010, : 772 - 777
  • [5] ACTLETS: A NOVEL LOCAL REPRESENTATION FOR HUMAN ACTION RECOGNITION IN VIDEO
    Ullah, Muhammad Muneeb
    Laptev, Ivan
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2012), 2012, : 777 - 780
  • [6] Probabilistic human recognition from video
    Zhou, SH
    Chellappa, R
    [J]. COMPUTER VISION - ECCV 2002 PT III, 2002, 2352 : 681 - 697
  • [7] Learning hierarchical video representation for action recognition
    Li Q.
    Qiu Z.
    Yao T.
    Mei T.
    Rui Y.
    Luo J.
    [J]. International Journal of Multimedia Information Retrieval, 2017, 6 (1) : 85 - 98
  • [8] A Robust and Efficient Video Representation for Action Recognition
    Heng Wang
    Dan Oneata
    Jakob Verbeek
    Cordelia Schmid
    [J]. International Journal of Computer Vision, 2016, 119 : 219 - 238
  • [9] A Robust and Efficient Video Representation for Action Recognition
    Wang, Heng
    Oneata, Dan
    Verbeek, Jakob
    Schmid, Cordelia
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2016, 119 (03) : 219 - 238
  • [10] Human Action Recognition in Video
    Singh, Dushyant Kumar
    [J]. ADVANCED INFORMATICS FOR COMPUTING RESEARCH, ICAICR 2018, PT I, 2019, 955 : 54 - 66