Discriminative Multi-View Dynamic Image Fusion for Cross-View 3-D Action Recognition

Cited by: 15
Authors
Wang, Yancheng [1 ]
Xiao, Yang [1 ]
Lu, Junyi [1 ]
Tan, Bo [1 ]
Cao, Zhiguo [1 ]
Zhang, Zhenjun [2 ]
Zhou, Joey Tianyi [3 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Natl Key Lab Sci & Technol Multispectral Informat, Wuhan 430074, Peoples R China
[2] Hunan Univ, Coll Elect & Informat Engn, Natl Engn Lab Robot Visual Percept & Control Tech, Changsha 410082, Peoples R China
[3] ASTAR, Inst High Performance Comp, Singapore 138632, Singapore
Funding
National Natural Science Foundation of China
Keywords
Visualization; Feature extraction; Encoding; Skeleton; Task analysis; Image recognition; Image coding; Cross-view 3-D action recognition; discriminative viewpoint instance discovery; Fisher vector (FV); multi-view dynamic image (MVDI); viewpoint aggregation; JOINTS;
DOI
10.1109/TNNLS.2021.3070179
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dramatic variation in imaging viewpoint is the critical challenge for action recognition in depth video. One feasible remedy is to make the visual feature more view-tolerant while maintaining strong discriminative capacity. The multi-view dynamic image (MVDI) is a recently proposed 3-D action representation that compactly encodes human motion information and 3-D visual cues. However, it is still view-sensitive. To improve its performance, we propose a discriminative MVDI fusion method based on multi-instance learning (MIL). Specifically, the dynamic images (DIs) from different observation viewpoints are regarded as the instances for 3-D action characterization. After being encoded with the Fisher vector (FV), they are aggregated by sum-pooling to yield the representative 3-D action signature. Our insight is that viewpoint aggregation enhances view-tolerance, and that FV maps the raw DI feature to a higher dimensional feature space to promote discriminative power. A discriminative viewpoint instance discovery method is also proposed to discard the viewpoint instances unfavorable for action characterization. Extensive experiments on five data sets demonstrate that the proposed method significantly enhances the performance of cross-view 3-D action recognition. It is also applicable to cross-view 3-D object recognition. The source code is available at https://github.com/3huo/ActionView.
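The encode-then-aggregate step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that). It assumes each viewpoint's DI has already been reduced to one feature vector, that a diagonal-covariance GMM (`weights`, `means`, `sigmas`) has been fitted beforehand, and that the standard power and L2 normalization are applied after pooling. The function names and the `keep` argument, which stands in for the result of discriminative viewpoint instance discovery, are hypothetical.

```python
import numpy as np


def fisher_vector(x, weights, means, sigmas):
    """Fisher vector of one descriptor x (D,) under a diagonal GMM with K components.

    weights: (K,), means: (K, D), sigmas: (K, D) standard deviations.
    Returns a vector of length 2 * K * D (mean and variance gradients).
    """
    diff = (x[None, :] - means) / sigmas                      # (K, D)
    # Log posterior up to a constant shared by all components.
    log_post = (np.log(weights)
                - 0.5 * np.sum(diff ** 2, axis=1)
                - np.sum(np.log(sigmas), axis=1))
    log_post -= log_post.max()                                # numerical stability
    gamma = np.exp(log_post)
    gamma /= gamma.sum()                                      # (K,) soft assignments
    g_mu = gamma[:, None] * diff / np.sqrt(weights)[:, None]
    g_sigma = (gamma[:, None] * (diff ** 2 - 1.0)
               / np.sqrt(2.0 * weights)[:, None])
    return np.concatenate([g_mu.ravel(), g_sigma.ravel()])


def aggregate_views(view_features, weights, means, sigmas, keep=None):
    """Sum-pool the FVs of the viewpoint instances into one action signature.

    keep: optional indices of the viewpoint instances to retain (assumed to
    come from some instance-selection step); all views are used when None.
    """
    if keep is None:
        keep = range(len(view_features))
    fvs = np.stack([fisher_vector(view_features[i], weights, means, sigmas)
                    for i in keep])
    pooled = fvs.sum(axis=0)                                  # sum-pooling over views
    pooled = np.sign(pooled) * np.sqrt(np.abs(pooled))        # power normalization
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled
```

Because sum-pooling is order-independent, the signature does not depend on how the viewpoints are enumerated, which is one way to read the claim that viewpoint aggregation improves view-tolerance.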
Pages: 5332-5345
Page count: 14