View knowledge transfer network for multi-view action recognition

Cited by: 9
Authors
Liang, Zixi [1 ]
Yin, Ming [1 ]
Gao, Junli [1 ]
He, Yicheng [1 ]
Huang, Weitian [2 ]
Affiliations
[1] Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Peoples R China
[2] South China Univ Technol, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Action recognition; Deep learning; Multi-view learning; Generative adversarial network; Late fusion;
DOI
10.1016/j.imavis.2021.104357
CLC number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
As data in many practical applications occur, or can be captured, in the form of multiple views, multi-view action recognition has received much attention recently, since the complementary and heterogeneous information across views can be exploited to benefit the downstream task. However, most existing methods assume that the multi-view data are complete, which may not hold in real-world applications. To this end, this paper proposes a novel View Knowledge Transfer Network (VKTNet) for multi-view action recognition that remains effective even when some views are incomplete. Specifically, view knowledge transfer is performed with a conditional generative adversarial network (cGAN) that reproduces each view's latent representation conditioned on the other view's information. In this way, high-level semantic features are extracted to bridge the semantic gap between two different views. In addition, to fuse the decision results of the individual views efficiently, a Siamese Scaling Network (SSN) is proposed in place of a simple classifier. Experimental results on three public datasets show that our model outperforms competing methods when all views are available, while avoiding severe performance degradation when some views are missing. (c) 2021 Elsevier B.V. All rights reserved.
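The abstract describes two components: a cGAN that reproduces a missing view's latent representation conditioned on another view's features, and a Siamese Scaling Network that weights the per-view decisions at late fusion. The PyTorch sketch below only illustrates this structure; the module names (ViewTransferGenerator, SiameseScalingFusion), layer widths, and the toy forward pass are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of the two ideas in the abstract (assumed architecture,
# not the authors' code): a conditional generator that hallucinates a
# missing view's representation, and a shared scoring branch for late fusion.
import torch
import torch.nn as nn


class ViewTransferGenerator(nn.Module):
    """cGAN-style generator: produces a target-view latent representation
    conditioned on the source view's features plus a noise vector."""
    def __init__(self, src_dim=512, noise_dim=64, tgt_dim=512):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(src_dim + noise_dim, 1024),
            nn.ReLU(inplace=True),
            nn.Linear(1024, tgt_dim),
        )

    def forward(self, src_feat):
        z = torch.randn(src_feat.size(0), self.noise_dim, device=src_feat.device)
        return self.net(torch.cat([src_feat, z], dim=1))


class ViewTransferDiscriminator(nn.Module):
    """Judges whether a target-view representation is real or generated,
    conditioned on the source-view features (standard cGAN critic)."""
    def __init__(self, src_dim=512, tgt_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(src_dim + tgt_dim, 512),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(512, 1),
        )

    def forward(self, src_feat, tgt_feat):
        return self.net(torch.cat([src_feat, tgt_feat], dim=1))


class SiameseScalingFusion(nn.Module):
    """Late fusion: one shared (Siamese) branch scores each view's logits,
    and the softmax-normalised scores weight the per-view decisions."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(num_classes, 64),
            nn.ReLU(inplace=True),
            nn.Linear(64, 1),
        )

    def forward(self, logits_per_view):
        # logits_per_view: list of (batch, num_classes) tensors, one per view
        scores = torch.cat([self.scorer(l) for l in logits_per_view], dim=1)
        weights = torch.softmax(scores, dim=1)               # (batch, views)
        stacked = torch.stack(logits_per_view, dim=1)        # (batch, views, classes)
        return (weights.unsqueeze(-1) * stacked).sum(dim=1)  # fused logits


if __name__ == "__main__":
    # Toy forward pass: view A is observed, view B is missing and hallucinated.
    feat_a = torch.randn(4, 512)
    feat_b = ViewTransferGenerator()(feat_a)          # reproduced view-B representation
    clf_a, clf_b = nn.Linear(512, 10), nn.Linear(512, 10)
    fused = SiameseScalingFusion(num_classes=10)([clf_a(feat_a), clf_b(feat_b)])
    print(fused.shape)  # torch.Size([4, 10])
```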
Pages: 7
Related papers
50 records in total
  • [1] Multi-view representation learning for multi-view action recognition
    Hao, Tong; Wu, Dan; Wang, Qian; Sun, Jin-Sheng
    Journal of Visual Communication and Image Representation, 2017, 48: 453-460
  • [2] Dividing and Aggregating Network for Multi-view Action Recognition
    Wang, Dongang; Ouyang, Wanli; Li, Wen; Xu, Dong
    Computer Vision - ECCV 2018, Pt IX, 2018, 11213: 457-473
  • [3] Action Recognition with a Multi-View Temporal Attention Network
    Sun, Dengdi; Su, Zhixiang; Ding, Zhuanlian; Luo, Bin
    Cognitive Computation, 2022, 14(03): 1082-1095
  • [4] DVANet: Disentangling View and Action Features for Multi-View Action Recognition
    Siddiqui, Nyle; Tirupattur, Praveen; Shah, Mubarak
    Thirty-Eighth AAAI Conference on Artificial Intelligence, Vol 38, No 5, 2024: 4873-4881
  • [5] Generative Multi-View Human Action Recognition
    Wang, Lichen; Ding, Zhengming; Tao, Zhiqiang; Liu, Yunyu; Fu, Yun
    2019 IEEE/CVF International Conference on Computer Vision (ICCV 2019), 2019: 6221-6230
  • [6] Multi-view human action recognition: A survey
    Iosifidis, Alexandros; Tefas, Anastasios; Pitas, Ioannis
    2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2013), 2013: 522-525
  • [7] Continuous Multi-View Human Action Recognition
    Wang, Qiang; Sun, Gan; Dong, Jiahua; Wang, Qianqian; Ding, Zhengming
    IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(06): 3603-3614
  • [8] Multi-View Super Vector for Action Recognition
    Cai, Zhuowei; Wang, Limin; Peng, Xiaojiang; Qiao, Yu
    2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014: 596-603
  • [9] Conflux LSTMs Network: A Novel Approach for Multi-View Action Recognition
    Ullah, Amin; Muhammad, Khan; Hussain, Tanveer; Baik, Sung Wook
    Neurocomputing, 2021, 435: 321-329