Multi-View Super Vector for Action Recognition

Cited by: 135
Authors
Cai, Zhuowei [1]
Wang, Limin [1, 2]
Peng, Xiaojiang [1]
Qiao, Yu [1, 2]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen Key Lab Comp Vis & Pattern Recognit, Beijing, Peoples R China
[2] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China
DOI
10.1109/CVPR.2014.83
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Images and videos are often characterized by multiple types of local descriptors such as SIFT, HOG and HOF, each of which describes certain aspects of object feature. Recognition systems benefit from fusing multiple types of these descriptors. Two widely applied fusion pipelines are descriptor concatenation and kernel average. The first one is effective when different descriptors are strongly correlated, while the second one is probably better when descriptors are relatively independent. In practice, however, different descriptors are neither fully independent nor fully correlated, and previous fusion methods may not be satisfying. In this paper, we propose a new global representation, Multi-View Super Vector (MVSV), which is composed of relatively independent components derived from a pair of descriptors. Kernel average is then applied on these components to produce recognition result. To obtain MVSV, we develop a generative mixture model of probabilistic canonical correlation analyzers (M-PCCA), and utilize the hidden factors and gradient vectors of M-PCCA to construct MVSV for video representation. Experiments on video based action recognition tasks show that MVSV achieves promising results, and outperforms FV and VLAD with descriptor concatenation or kernel average fusion strategy.
Pages: 596 - 603
Number of pages: 8