Multi-View Super Vector for Action Recognition

Cited by: 135
Authors
Cai, Zhuowei [1]
Wang, Limin [1,2]
Peng, Xiaojiang [1]
Qiao, Yu [1,2]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen Key Lab Comp Vis & Pattern Recognit, Beijing, Peoples R China
[2] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China
DOI
10.1109/CVPR.2014.83
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Images and videos are often characterized by multiple types of local descriptors such as SIFT, HOG and HOF, each of which describes certain aspects of object features. Recognition systems benefit from fusing multiple types of these descriptors. Two widely applied fusion pipelines are descriptor concatenation and kernel average. The first is effective when different descriptors are strongly correlated, while the second is probably better when descriptors are relatively independent. In practice, however, different descriptors are neither fully independent nor fully correlated, and previous fusion methods may not be satisfactory. In this paper, we propose a new global representation, Multi-View Super Vector (MVSV), which is composed of relatively independent components derived from a pair of descriptors. Kernel average is then applied to these components to produce the recognition result. To obtain MVSV, we develop a generative mixture model of probabilistic canonical correlation analyzers (M-PCCA), and utilize the hidden factors and gradient vectors of M-PCCA to construct MVSV for video representation. Experiments on video-based action recognition tasks show that MVSV achieves promising results and outperforms FV and VLAD with either the descriptor concatenation or the kernel average fusion strategy.
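The two fusion pipelines named in the abstract can be contrasted with a minimal sketch. This is only an illustration under stated assumptions, not the paper's MVSV or M-PCCA: the arrays `hog` and `hof` are random stand-ins for per-video encodings of two descriptor types, the Gaussian (RBF) kernel and the value of `gamma` are assumptions (the abstract does not name a kernel), and the function names are hypothetical.

```python
import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y."""
    sq = (x ** 2).sum(1)[:, None] + (y ** 2).sum(1)[None, :] - 2.0 * x @ y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clip tiny negative round-off

def concat_kernel(views_x, views_y, gamma=0.5):
    """Descriptor concatenation: stack the views, then compute one kernel."""
    return rbf_kernel(np.hstack(views_x), np.hstack(views_y), gamma)

def average_kernel(views_x, views_y, gamma=0.5):
    """Kernel average: one kernel per view, then the arithmetic mean."""
    kernels = [rbf_kernel(a, b, gamma) for a, b in zip(views_x, views_y)]
    return sum(kernels) / len(kernels)

# Hypothetical data: 4 videos, two descriptor views of different dimension.
rng = np.random.default_rng(0)
hog = rng.normal(size=(4, 8))
hof = rng.normal(size=(4, 6))

k_cat = concat_kernel([hog, hof], [hog, hof])
k_avg = average_kernel([hog, hof], [hog, hof])
```

With an RBF kernel, concatenation is equivalent to multiplying the per-view kernels, so two videos count as similar only when both views agree; averaging lets a single view contribute similarity on its own. Either matrix could then be passed to a kernel classifier that accepts precomputed kernels.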
Pages: 596 - 603
Number of pages: 8
Related papers
50 in total
  • [21] Multi-View Action Recognition by Cross-domain Learning
    Nie, Weizhi
    Liu, Anan
    Yu, Jing
    Su, Yuting
    Chaisorn, Lekha
    Wang, Yongkang
    Kankanhalli, Mohan S.
    2014 IEEE 16TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2014,
  • [22] Jointly Learning Multi-view Features for Human Action Recognition
    Wang, Ruoshi
    Liu, Zhigang
    Yin, Ziyang
    PROCEEDINGS OF THE 32ND 2020 CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2020), 2020, : 4858 - 4861
  • [23] Multi-View Latent Variable Discriminative Models For Action Recognition
    Song, Yale
    Morency, Louis-Philippe
    Davis, Randall
    2012 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2012, : 2120 - 2127
  • [24] Unsupervised video segmentation for multi-view daily action recognition
    Liu, Zhigang
    Wu, Yin
    Yin, Ziyang
    Gao, Chunlei
    IMAGE AND VISION COMPUTING, 2023, 134
  • [25] Discriminative Multi-View Subspace Feature Learning for Action Recognition
    Sheng, Biyun
    Li, Jun
    Xiao, Fu
    Li, Qun
    Yang, Wankou
    Han, Junwei
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (12) : 4591 - 4600
  • [26] Support vector machine based multi-view face detection and recognition
    Li, YM
    Gong, SG
    Sherrah, J
    Liddell, H
    IMAGE AND VISION COMPUTING, 2004, 22 (05) : 413 - 427
  • [27] MULTI-TASK LINEAR DISCRIMINANT ANALYSIS FOR MULTI-VIEW ACTION RECOGNITION
    Yan, Yan
    Liu, Gaowen
    Ricci, Elisa
    Sebe, Nicu
    2013 20TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2013), 2013, : 2837 - 2841
  • [28] VIEW-INDEPENDENT HUMAN ACTION RECOGNITION BASED ON MULTI-VIEW ACTION IMAGES AND DISCRIMINANT LEARNING
    Iosifidis, Alexandros
    Tefas, Anastasios
    Pitas, Ioannis
    2013 IEEE 11TH IVMSP WORKSHOP: 3D IMAGE/VIDEO TECHNOLOGIES AND APPLICATIONS (IVMSP 2013), 2013,
  • [29] Multi-view Regularized Extreme Learning Machine for Human Action Recognition
    Iosifidis, Alexandros
    Tefas, Anastasios
    Pitas, Ioannis
    ARTIFICIAL INTELLIGENCE: METHODS AND APPLICATIONS, 2014, 8445 : 84 - 94
  • [30] Fine-grained action recognition using multi-view attentions
    Zhu, Yisheng
    Liu, Guangcan
    VISUAL COMPUTER, 2020, 36 (09) : 1771 - 1781