Discriminative Multi-View Dynamic Image Fusion for Cross-View 3-D Action Recognition

Cited by: 15
Authors
Wang, Yancheng [1]
Xiao, Yang [1]
Lu, Junyi [1]
Tan, Bo [1]
Cao, Zhiguo [1]
Zhang, Zhenjun [2]
Zhou, Joey Tianyi [3]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Natl Key Lab Sci & Technol Multispectral Informat, Wuhan 430074, Peoples R China
[2] Hunan Univ, Coll Elect & Informat Engn, Natl Engn Lab Robot Visual Percept & Control Tech, Changsha 410082, Peoples R China
[3] ASTAR, Inst High Performance Comp, Singapore 138632, Singapore
Funding
National Natural Science Foundation of China;
Keywords
Visualization; Feature extraction; Encoding; Skeleton; Task analysis; Image recognition; Image coding; Cross-view 3-D action recognition; discriminative viewpoint instance discovery; Fisher vector (FV); multi-view dynamic image (MVDI); viewpoint aggregation; JOINTS;
DOI
10.1109/TNNLS.2021.3070179
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dramatic variation in imaging viewpoint is a critical challenge for action recognition in depth video. One feasible way to address it is to make the visual feature more view-tolerant while preserving strong discriminative capacity. The multi-view dynamic image (MVDI) is a recently proposed 3-D action representation that compactly encodes human motion information and 3-D visual cues. However, it is still view-sensitive. To improve its performance, we propose a discriminative MVDI fusion method based on multi-instance learning (MIL). Specifically, the dynamic images (DIs) from different observation viewpoints are regarded as the instances for 3-D action characterization. Each instance is encoded with a Fisher vector (FV), and the encoded instances are then aggregated by sum-pooling to yield the representative 3-D action signature. Our insight is that viewpoint aggregation helps to enhance view-tolerance, while FV maps the raw DI feature to a higher dimensional feature space to promote discriminative power. A discriminative viewpoint instance discovery method is also proposed to discard the viewpoint instances that are unfavorable for action characterization. Extensive experiments on five data sets demonstrate that the proposed method significantly enhances the performance of cross-view 3-D action recognition. It is also applicable to cross-view 3-D object recognition. The source code is available at https://github.com/3huo/ActionView.
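The encode-then-aggregate step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that). It assumes a diagonal-covariance GMM whose parameters are already given, uses a simplified Fisher vector that keeps only the gradient with respect to the GMM means, and omits the discriminative viewpoint instance discovery step. The function names `fisher_vector_means` and `aggregate_views`, the array shapes, and the random demo data are all hypothetical.

```python
import numpy as np


def fisher_vector_means(x, weights, means, sigmas):
    """Simplified Fisher vector (gradient w.r.t. GMM means only).

    x:       (T, D) descriptors from one viewpoint instance
    weights: (K,)   mixture weights, summing to 1
    means:   (K, D) component means
    sigmas:  (K, D) per-dimension standard deviations (diagonal GMM)
    returns: (K * D,) encoding
    """
    # Whitened offsets of every descriptor from every component mean.
    diff = (x[:, None, :] - means[None, :, :]) / sigmas[None, :, :]  # (T, K, D)
    # Log of weight * Gaussian density; the 2*pi constant cancels below.
    log_prob = (-0.5 * np.sum(diff ** 2, axis=2)
                - np.sum(np.log(sigmas), axis=1)[None, :]
                + np.log(weights)[None, :])                          # (T, K)
    # Soft assignments (posteriors), computed stably.
    log_prob -= log_prob.max(axis=1, keepdims=True)
    gamma = np.exp(log_prob)
    gamma /= gamma.sum(axis=1, keepdims=True)
    # Posterior-weighted average of whitened offsets per component.
    fv = (gamma[:, :, None] * diff).sum(axis=0)                      # (K, D)
    fv /= x.shape[0] * np.sqrt(weights)[:, None]
    return fv.ravel()


def aggregate_views(view_descriptors, weights, means, sigmas):
    """Encode each viewpoint instance, sum-pool, then L2-normalize."""
    encoded = [fisher_vector_means(x, weights, means, sigmas)
               for x in view_descriptors]
    pooled = np.sum(encoded, axis=0)
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled


# Hypothetical demo: 4 viewpoints, 5 descriptors each, D=3, K=2 components.
rng = np.random.default_rng(0)
K, D = 2, 3
weights = np.array([0.4, 0.6])
means = rng.normal(size=(K, D))
sigmas = np.ones((K, D))
views = [rng.normal(size=(5, D)) for _ in range(4)]

signature = aggregate_views(views, weights, means, sigmas)
print(signature.shape)
```

Because sum-pooling does not depend on the order of the viewpoint instances, the resulting signature is the same whichever order the views are supplied in, which is one intuition for why aggregation can help view-tolerance.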
Pages: 5332-5345
Number of pages: 14