Discriminative Multi-View Dynamic Image Fusion for Cross-View 3-D Action Recognition

Cited by: 15
Authors
Wang, Yancheng [1]
Xiao, Yang [1]
Lu, Junyi [1]
Tan, Bo [1]
Cao, Zhiguo [1]
Zhang, Zhenjun [2]
Zhou, Joey Tianyi [3]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Natl Key Lab Sci & Technol Multispectral Informat, Wuhan 430074, Peoples R China
[2] Hunan Univ, Coll Elect & Informat Engn, Natl Engn Lab Robot Visual Percept & Control Tech, Changsha 410082, Peoples R China
[3] ASTAR, Inst High Performance Comp, Singapore 138632, Singapore
Funding
National Natural Science Foundation of China
Keywords
Visualization; Feature extraction; Encoding; Skeleton; Task analysis; Image recognition; Image coding; Cross-view 3-D action recognition; discriminative viewpoint instance discovery; Fisher vector (FV); multi-view dynamic image (MVDI); viewpoint aggregation; JOINTS;
DOI
10.1109/TNNLS.2021.3070179
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dramatic variation in imaging viewpoint is the critical challenge for action recognition in depth video. One feasible remedy is to make the visual feature more view-tolerant while preserving its discriminative capacity. The multi-view dynamic image (MVDI) is a recently proposed 3-D action representation that compactly encodes human motion information and 3-D visual cues, but it remains view-sensitive. To improve its performance, we propose a discriminative MVDI fusion method based on multi-instance learning (MIL). Specifically, the dynamic images (DIs) from different observation viewpoints are treated as instances for 3-D action characterization. Each instance is encoded with a Fisher vector (FV), and the encoded instances are then aggregated by sum-pooling to yield a representative 3-D action signature. Our insight is that viewpoint aggregation enhances view tolerance, while FV maps the raw DI feature into a higher-dimensional feature space to strengthen its discriminative power. A discriminative viewpoint instance discovery method is also proposed to discard viewpoint instances that are unfavorable for action characterization. Extensive experiments on five data sets demonstrate that the proposed method significantly improves cross-view 3-D action recognition and that it is also applicable to cross-view 3-D object recognition. The source code is available at https://github.com/3huo/ActionView.
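The encode-then-pool step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the repository linked above for that) and rests on several assumptions: each viewpoint contributes a single feature vector, the Fisher vector keeps only the gradient with respect to the GMM means, the GMM parameters are random placeholders rather than fitted values, the final L2 normalization is an added convenience, and the `keep` mask is a hypothetical stand-in for the discriminative viewpoint instance discovery step. The function names are illustrative as well.

```python
import numpy as np


def fisher_vector_means(x, weights, means, sigmas):
    """Mean-gradient Fisher vector of one feature x under a diagonal GMM.

    x: (d,) feature; weights: (k,); means, sigmas: (k, d).
    Returns a vector of length k * d.
    """
    # Deviation of x from each component, scaled by that component's std: (k, d)
    diff = (x[None, :] - means) / sigmas
    # Unnormalized log-posterior of each component (shared constants cancel)
    log_p = np.log(weights) - np.log(sigmas).sum(axis=1) - 0.5 * (diff ** 2).sum(axis=1)
    # Numerically stable softmax gives the posterior responsibilities: (k,)
    gamma = np.exp(log_p - log_p.max())
    gamma /= gamma.sum()
    # Responsibility-weighted deviations, scaled by 1 / sqrt(mixture weight)
    return (gamma[:, None] * diff / np.sqrt(weights)[:, None]).ravel()


def aggregate_views(view_features, weights, means, sigmas, keep=None):
    """Encode each viewpoint instance, then sum-pool into one signature.

    view_features: (V, d), one feature per viewpoint (an assumption).
    keep: optional boolean mask of length V; a hypothetical placeholder
          for whichever instance-selection rule is actually used.
    """
    fvs = np.stack([fisher_vector_means(x, weights, means, sigmas) for x in view_features])
    if keep is not None:
        fvs = fvs[np.asarray(keep, dtype=bool)]
    pooled = fvs.sum(axis=0)  # sum-pooling over viewpoint instances
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled  # L2 normalization (added here)


# Toy usage with random placeholder data
rng = np.random.default_rng(0)
k, d, n_views = 3, 4, 5
weights = np.full(k, 1.0 / k)
means = rng.normal(size=(k, d))
sigmas = np.ones((k, d))
views = rng.normal(size=(n_views, d))
signature = aggregate_views(views, weights, means, sigmas)
```

Because sum-pooling is order-independent, the signature does not change when the viewpoints are permuted, which is one simple way to see why aggregating over viewpoints can help view tolerance.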
Pages: 5332-5345
Page count: 14
Related papers
50 in total
  • [1] Multi-View Gait Image Generation for Cross-View Gait Recognition
    Chen, Xin
    Luo, Xizhao
    Weng, Jian
    Luo, Weiqi
    Li, Huiting
    Tian, Qi
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 3041 - 3055
  • [2] Mining Discriminative 3D Poselet for Cross-view Action Recognition
    Wang, Jiang
    Nie, Xiaohan
    Xia, Yin
    Wu, Ying
    2014 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2014, : 634 - 639
  • [3] Discriminative Virtual Views for Cross-View Action Recognition
    Li, Ruonan
    Zickler, Todd
    2012 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2012, : 2855 - 2862
  • [4] Bidirectional Fusion With Cross-View Graph Filter for Multi-View Clustering
    Yang X.
    Zhu T.
    Wu D.
    Wang P.
    Liu Y.
    Nie F.
    IEEE Transactions on Knowledge and Data Engineering, 2024, 36 (11) : 1 - 6
  • [5] 3-D Dynamic Multitarget Detection Algorithm Based on Cross-View Feature Fusion
    Zhou F.
    Tao C.
    Gao Z.
    Zhang Z.
    Zheng S.
    Zhu Y.
    IEEE Transactions on Artificial Intelligence, 2024, 5 (06) : 3146 - 3159
  • [6] Pairwise-Covariance Multi-view Discriminant Analysis for Robust Cross-View Human Action Recognition
    Tran, Hoang-Nhat
    Nguyen, Hong-Quan
    Doan, Huong-Giang
    Tran, Thanh-Hai
    Le, Thi-Lan
    Vu, Hai
    IEEE ACCESS, 2021, 9 : 76097 - 76111
  • [7] Multi-view Deep Network for Cross-view Classification
    Kan, Meina
    Shan, Shiguang
    Chen, Xilin
    2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 4847 - 4855
  • [8] Multi-View Latent Variable Discriminative Models For Action Recognition
    Song, Yale
    Morency, Louis-Philippe
    Davis, Randall
    2012 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2012, : 2120 - 2127
  • [9] Discriminative Multi-View Subspace Feature Learning for Action Recognition
    Sheng, Biyun
    Li, Jun
    Xiao, Fu
    Li, Qun
    Yang, Wankou
    Han, Junwei
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (12) : 4591 - 4600
  • [10] Cross-View Cross-Scene Multi-View Crowd Counting
    Zhang, Qi
    Lin, Wei
    Chan, Antoni B.
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 557 - 567