Discriminative Multi-View Dynamic Image Fusion for Cross-View 3-D Action Recognition

Cited by: 15

Authors:

Wang, Yancheng [1]

Xiao, Yang [1]

Lu, Junyi [1]

Tan, Bo [1]

Cao, Zhiguo [1]

Zhang, Zhenjun [2]

Zhou, Joey Tianyi [3]

Affiliations:

[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Natl Key Lab Sci & Technol Multispectral Informat, Wuhan 430074, Peoples R China

[2] Hunan Univ, Coll Elect & Informat Engn, Natl Engn Lab Robot Visual Percept & Control Tech, Changsha 410082, Peoples R China

[3] ASTAR, Inst High Performance Comp, Singapore 138632, Singapore

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2022, Vol. 33, No. 10

Funding:

National Natural Science Foundation of China;

Keywords:

Visualization; Feature extraction; Encoding; Skeleton; Task analysis; Image recognition; Image coding; Cross-view 3-D action recognition; discriminative viewpoint instance discovery; Fisher vector (FV); multi-view dynamic image (MVDI); viewpoint aggregation; JOINTS;

DOI:

10.1109/TNNLS.2021.3070179

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Dramatic variation in imaging viewpoint is the critical challenge for action recognition in depth video. One feasible way to address it is to enhance the view tolerance of the visual feature while maintaining strong discriminative capacity. The multi-view dynamic image (MVDI) is a recently proposed 3-D action representation that compactly encodes human motion information and 3-D visual cues. However, it is still view-sensitive. To improve its performance, we propose a discriminative MVDI fusion method based on multi-instance learning (MIL). Specifically, the dynamic images (DIs) from different observation viewpoints are regarded as the instances for 3-D action characterization. After being encoded with the Fisher vector (FV), they are aggregated by sum-pooling to yield the representative 3-D action signature. Our insight is that viewpoint aggregation helps to enhance view tolerance, and that FV maps the raw DI feature to a higher-dimensional feature space to promote discriminative power. In addition, a discriminative viewpoint instance discovery method is proposed to discard the viewpoint instances that are unfavorable for action characterization. Extensive experiments on five data sets demonstrate that the proposed method significantly enhances the performance of cross-view 3-D action recognition. It is also applicable to cross-view 3-D object recognition. The source code is available at https://github.com/3huo/ActionView.
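The encode-then-aggregate step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that). It assumes each viewpoint's DI has already been reduced to a single feature vector, that a diagonal-covariance GMM has been fitted beforehand, and that the pooled signature is L2-normalized at the end. The names `fisher_vector`, `aggregate_views`, and `view_features` are hypothetical, and the discriminative viewpoint instance discovery step is omitted.

```python
import numpy as np

def fisher_vector(x, weights, means, sigmas):
    """Fisher vector of one descriptor x (D,) under a diagonal GMM.

    weights: (K,) mixture weights; means, sigmas: (K, D) means and std devs.
    Returns a vector of length 2*K*D (gradients w.r.t. means and std devs).
    """
    diff = (x - means) / sigmas                          # (K, D) whitened offsets
    # Log of weight * Gaussian density, up to a constant shared by all components.
    log_prob = (np.log(weights)
                - 0.5 * np.sum(diff ** 2, axis=1)
                - np.sum(np.log(sigmas), axis=1))
    log_prob -= log_prob.max()                           # numerical stability
    gamma = np.exp(log_prob)
    gamma /= gamma.sum()                                 # posteriors, shape (K,)
    g_mu = gamma[:, None] * diff / np.sqrt(weights)[:, None]
    g_sigma = gamma[:, None] * (diff ** 2 - 1.0) / np.sqrt(2.0 * weights)[:, None]
    return np.concatenate([g_mu.ravel(), g_sigma.ravel()])

def aggregate_views(view_features, weights, means, sigmas):
    """Sum-pool the per-viewpoint Fisher vectors into one action signature."""
    fvs = [fisher_vector(x, weights, means, sigmas) for x in view_features]
    pooled = np.sum(fvs, axis=0)                         # sum-pooling over viewpoints
    norm = np.linalg.norm(pooled)                        # assumed final L2 normalization
    return pooled / norm if norm > 0 else pooled
```

Because sum-pooling is permutation-invariant, the signature does not depend on the order in which the viewpoint instances are supplied, which is one intuition for why aggregation helps view tolerance.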


Pages: 5332-5345

Page count: 14

50 items in total

[21] Cross-View Action Recognition via View Knowledge Transfer
Liu, Jingen
Shah, Mubarak
Kuipers, Benjamin
Savarese, Silvio
[J]. 2011 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2011,
[22] Cross-View Gait Recognition Using View-Dependent Discriminative Analysis
Mansur, Al
Makihara, Yasushi
Muramatsu, Daigo
Yagi, Yasushi
[J]. 2014 IEEE/IAPR INTERNATIONAL JOINT CONFERENCE ON BIOMETRICS (IJCB 2014), 2014,
[23] Multi-view Discriminant Analysis with Tensor Representation and Its Application to Cross-view Gait Recognition
Makihara, Yasushi
Al Mansur
Muramatsu, Daigo
Uddin, Zasim
Yagi, Yasushi
[J]. 2015 11TH IEEE INTERNATIONAL CONFERENCE AND WORKSHOPS ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG), VOL. 1, 2015,
[24] DISCRIMINATIVE MULTI-VIEW FEATURE SELECTION AND FUSION
Liu, Yanbin
Liao, Binbing
Han, Yahong
[J]. 2015 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO (ICME), 2015,
[25] View Synthesis with Scene Recognition for Cross-View Image Localization
Lee, Uddom
Jiang, Peng
Wu, Hongyi
Xin, Chunsheng
[J]. FUTURE INTERNET, 2023, 15 (04):
[26] Unpaired Multi-View Graph Clustering With Cross-View Structure Matching
Wen, Yi
Wang, Siwei
Liao, Qing
Liang, Weixuan
Liang, Ke
Wan, Xinhang
Liu, Xinwang
[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 35 (11) : 1 - 15
[27] Cross-view multi-layer perceptron for incomplete multi-view learning
Wang, Zhi
Zhou, Heng
Zhong, Ping
Zou, Hui
[J]. APPLIED SOFT COMPUTING, 2024, 157
[28] Cross-view Action Modeling, Learning and Recognition
Wang, Jiang
Nie, Xiaohan
Xia, Yin
Wu, Ying
Zhu, Song-Chun
[J]. 2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014, : 2649 - 2656
[29] Projected cross-view learning for unbalanced incomplete multi-view clustering
Cai, Yiran
Che, Hangjun
Pan, Baicheng
Leung, Man-Fai
Liu, Cheng
Wen, Shiping
[J]. INFORMATION FUSION, 2024, 105
[30] Adaptive filtering for cross-view prediction in multi-view video coding
Lai, Polin
Su, Yeping
Gomila, Cristina
Ortega, Antonio
[J]. VISUAL COMMUNICATIONS AND IMAGE PROCESSING 2007, PTS 1 AND 2, 2007, 6508
