Multi-View Super Vector for Action Recognition

Cited by: 135
Authors
Cai, Zhuowei [1 ]
Wang, Limin [1 ,2 ]
Peng, Xiaojiang [1 ]
Qiao, Yu [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen Key Lab Comp Vis & Pattern Recognit, Shenzhen, Peoples R China
[2] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China
Keywords
DOI
10.1109/CVPR.2014.83
Chinese Library Classification (CLC) Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Images and videos are often characterized by multiple types of local descriptors, such as SIFT, HOG, and HOF, each of which captures certain aspects of the visual content. Recognition systems benefit from fusing several of these descriptor types. Two widely applied fusion pipelines are descriptor concatenation and kernel average. The first is effective when the descriptors are strongly correlated, while the second tends to work better when they are relatively independent. In practice, however, different descriptors are neither fully independent nor fully correlated, so neither fusion strategy may be satisfactory. In this paper, we propose a new global representation, the Multi-View Super Vector (MVSV), which is composed of relatively independent components derived from a pair of descriptors. Kernel average is then applied to these components to produce the recognition result. To obtain the MVSV, we develop a generative mixture model of probabilistic canonical correlation analyzers (M-PCCA), and use the hidden factors and gradient vectors of M-PCCA to construct the MVSV video representation. Experiments on video-based action recognition tasks show that MVSV achieves promising results and outperforms FV and VLAD combined with either the descriptor concatenation or kernel average fusion strategy.
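To make the two baseline fusion pipelines mentioned in the abstract concrete, the sketch below contrasts descriptor concatenation (one fused vector per video) with kernel average (one linear Gram matrix per descriptor channel, then averaged). It is a minimal numpy illustration under the assumption that per-video encodings are already computed; the arrays hog_fv and hof_fv are hypothetical placeholders for Fisher-vector-style features, and this is not the authors' M-PCCA/MVSV implementation, which further decomposes each descriptor pair into relatively independent components before kernel averaging.

import numpy as np

def concat_fusion(per_descriptor_vecs):
    # Descriptor concatenation: L2-normalize each channel, then stack into one vector.
    normed = [v / (np.linalg.norm(v) + 1e-12) for v in per_descriptor_vecs]
    return np.concatenate(normed)

def kernel_average(per_descriptor_mats):
    # Kernel average: one linear Gram matrix per descriptor channel, averaged.
    kernels = [X @ X.T for X in per_descriptor_mats]
    return sum(kernels) / len(kernels)

# Toy data: 5 videos, two hypothetical descriptor channels (e.g. HOG and HOF encodings).
rng = np.random.default_rng(0)
hog_fv = rng.normal(size=(5, 128))
hof_fv = rng.normal(size=(5, 128))

fused = np.stack([concat_fusion([h, f]) for h, f in zip(hog_fv, hof_fv)])
avg_kernel = kernel_average([hog_fv, hof_fv])
print(fused.shape, avg_kernel.shape)   # (5, 256) (5, 5)

Either output would then be passed to a classifier (e.g. a linear or precomputed-kernel SVM); the paper's MVSV instead applies kernel average over the M-PCCA-derived components rather than over the raw descriptor channels.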
Pages: 596-603
Number of pages: 8
Related Papers
50 records in total
  • [31] MULTI-VIEW DESCRIPTOR MINING VIA CODEWORD NET FOR ACTION RECOGNITION
    Liu, Jingyu
    Huang, Yongzhen
    Peng, Xiaojiang
    Wang, Liang
    2015 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2015, : 793 - 797
  • [32] Joint Transferable Dictionary Learning and View Adaptation for Multi-view Human Action Recognition
    Sun, Bin
    Kong, Dehui
    Wang, Shaofan
    Wang, Lichun
    Yin, Baocai
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2021, 15 (02)
  • [33] Cross-modality online distillation for multi-view action recognition
    Xu, Chao
    Wu, Xia
    Li, Yachun
    Jin, Yining
    Wang, Mengmeng
    Liu, Yong
    NEUROCOMPUTING, 2021, 456 : 384 - 393
  • [34] Feature Extraction and Representation for Distributed Multi-View Human Action Recognition
    Luo, Jiajia
    Wang, Wei
    Qi, Hairong
    IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2013, 3 (02) : 145 - 154
  • [35] Simultaneous Action Recognition and Localization Based on Multi-View Hough Voting
    Hara, Kensho
    Hirayama, Takatsugu
    Mase, Kenji
    2013 SECOND IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR 2013), 2013, : 616 - 620
  • [36] Fine-grained action recognition using multi-view attentions
    Yisheng Zhu
    Guangcan Liu
    The Visual Computer, 2020, 36 : 1771 - 1781
  • [37] Conflux LSTMs Network: A Novel Approach for Multi-View Action Recognition
    Ullah, Amin
    Muhammad, Khan
    Hussain, Tanveer
    Baik, Sung Wook
    NEUROCOMPUTING, 2021, 435 : 321 - 329
  • [38] Human action recognition using multi-view image sequences features
    Ahmad, Mohiuddin
    Lee, Seong-Whan
    PROCEEDINGS OF THE SEVENTH INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION, 2006, : 523 - +
  • [39] Silhouette-Based Multi-View Human Action Recognition in Video
    Aryanfar, Alihossein
    Yaakob, Razali
    Halin, Alfian Abdul
    Sulaiman, Md Nasir
    Kasmiran, Khairul Azhar
    2014 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND TECHNOLOGY (ICCST), 2014,
  • [40] MULTI-VIEW FUSION FOR ACTION RECOGNITION IN CHILD-ROBOT INTERACTION
    Efthymiou, Niki
    Koutras, Petros
    Filntisis, Panagiotis Paraskevas
    Potamianos, Gerasimos
    Maragos, Petros
    2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 455 - 459