Pairwise-Covariance Multi-view Discriminant Analysis for Robust Cross-View Human Action Recognition

Cited by: 3
Authors
Tran, Hoang-Nhat [1 ]
Nguyen, Hong-Quan [1 ,2 ]
Doan, Huong-Giang [3 ]
Tran, Thanh-Hai [1 ,4 ]
Le, Thi-Lan [1 ,4 ]
Vu, Hai [1 ,4 ]
Affiliations
[1] Hanoi Univ Sci & Technol, Internat Res Inst MICA, Hanoi 10000, Vietnam
[2] Viet Hung Univ, Fac Informat Technol, Dept Informat Technol, Hanoi 10000, Vietnam
[3] Elect Power Univ, Fac Control & Automat, Dept Measurement Engn, Hanoi 10000, Vietnam
[4] Hanoi Univ Sci & Technol, Sch Elect & Telecommun, Hanoi 10000, Vietnam
Source
IEEE ACCESS | 2021, Vol. 9
Keywords
Feature extraction; Training; Three-dimensional displays; Neural networks; Cameras; Deep learning; Correlation; Multi-view analysis; action recognition; deep learning; cross-view recognition;
DOI
10.1109/ACCESS.2021.3082142
CLC Classification: TP [Automation and Computer Technology]
Discipline Code: 0812
Abstract
Human action recognition (HAR) under varying camera viewpoints is a critical requirement for practical deployment. In this paper, we propose a novel method that leverages successful deep learning-based features for action representation together with multi-view analysis to achieve robust HAR under viewpoint changes. Specifically, we investigate various deep learning techniques, from 2D to 3D CNNs, to capture the spatial and temporal characteristics of actions at each separate camera view. A common feature space is then constructed to retain view-invariant features across the extracted streams. This is carried out by learning a set of linear transformations that project view-private features into a common space in which the classes are well separated from each other. To this end, we first adopt Multi-view Discriminant Analysis (MvDA). The original MvDA can fail to find the most class-discriminative common space because its objective concentrates on pushing classes away from the global mean while ignoring the distances between specific pairs of adjoining classes. We therefore introduce a pairwise-covariance maximizing extension, named pc-MvDA, that takes pairwise distances between classes into account. The new method is also more readily applicable to large, high-dimensional multi-view datasets. Extensive experimental results on four datasets (IXMAS, MuHAVi, MICAGes, NTU RGB+D) show that pc-MvDA achieves consistent performance gains, especially for harder classes. The code is publicly available for research purposes at https://github.com/inspiros/pcmvda.
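As a rough illustration of the MvDA step described above (a minimal sketch, not the authors' implementation; `mvda_sketch` and all variable names are hypothetical), the coupling of per-view linear transforms can be written as ordinary discriminant analysis on features embedded block-wise into a direct-sum space, so that a single generalized eigenproblem yields all per-view projections at once:

```python
import numpy as np
from scipy.linalg import eigh


def mvda_sketch(Xs, ys, d_out):
    """Toy MvDA-style projection: learn one linear map per view so that
    same-class samples from all views cluster in a shared space.

    Xs: list of (n_v, d_v) feature matrices, one per view.
    ys: list of (n_v,) label arrays, one per view.
    Returns a list of (d_v, d_out) projection matrices, one per view.
    """
    dims = [X.shape[1] for X in Xs]
    offsets = np.cumsum([0] + dims)
    D = offsets[-1]

    # Embed each view's samples into block coordinates of R^D.
    Z, y = [], []
    for v, (X, yv) in enumerate(zip(Xs, ys)):
        E = np.zeros((X.shape[0], D))
        E[:, offsets[v]:offsets[v + 1]] = X
        Z.append(E)
        y.append(yv)
    Z = np.vstack(Z)
    y = np.concatenate(y)

    # Within- and between-class scatter on the embedded data; pooling
    # classes across views is what couples the per-view transforms.
    mu = Z.mean(axis=0)
    Sw = np.zeros((D, D))
    Sb = np.zeros((D, D))
    for c in np.unique(y):
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)
        Sw += (Zc - mc).T @ (Zc - mc)
        Sb += Zc.shape[0] * np.outer(mc - mu, mc - mu)

    # Generalized eigenproblem Sb w = lambda * Sw w; the top eigenvectors
    # stack the per-view projections as row blocks.
    Sw += 1e-6 * np.eye(D)  # regularize for invertibility
    lam, W = eigh(Sb, Sw)
    W = W[:, np.argsort(lam)[::-1][:d_out]]
    return [W[offsets[v]:offsets[v + 1]] for v in range(len(Xs))]
```

A sample from view `v` is then projected into the common space as `Xs[v] @ Ws[v]`. Note that this sketch only captures the global-mean objective of the original MvDA; the pairwise-covariance term that distinguishes pc-MvDA (weighting the distances between specific class pairs) would modify the between-class scatter `Sb`.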
Pages: 76097-76111 (15 pages)