A deep multimodal network based on bottleneck layer features fusion for action recognition

Cited by: 9

Authors:

Singh, Tej [1]

Vishwakarma, Dinesh Kumar [1]

Affiliations:

[1] Delhi Technol Univ, Dept Informat Technol, Biometr Res Lab, Delhi 110042, India

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2021 / Vol. 80 / Issue 24

Keywords:

Human Activity Recognition (HAR); Deep learning; DCA; SVM; LEVEL FUSION; SKELETON; DEPTH; REPRESENTATION; INFORMATION; DESCRIPTOR; JOINTS;

DOI:

10.1007/s11042-021-11415-9

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Human Activity Recognition (HAR) in videos using convolutional neural networks has become the preferred choice for researchers owing to the success of deep learning models in visual recognition applications. Since the introduction of low-cost depth sensors, activity recognition systems based on multiple modalities have been developed successfully over the past decade. Nevertheless, recognizing complex human activities in videos remains challenging. In this work, we propose a deep bottleneck multimodal feature fusion (D-BMFF) framework that fuses three modalities, namely RGB, depth (RGB-D) and 3D coordinate information, for activity classification. This makes fuller use of the information available simultaneously from a depth sensor. During training, RGB and depth frames of an activity video are fed at regular intervals, while the 3D coordinates are first converted into a single RGB skeleton motion history image (RGB-SklMHI). Features are extracted from the multimodal inputs using a recent pre-trained deep network architecture. The multimodal features obtained from the bottleneck layers before the top layer are fused using multiset discriminant correlation analysis (M-DCA), which allows for robust visual action modeling. Finally, the fused features are classified with a linear multiclass support vector machine (SVM). The proposed approach is evaluated on four standard RGB-D datasets: UT-Kinect, CAD-60, Florence 3D and SBU Interaction. The framework achieves strong results and outperforms state-of-the-art methods.
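
Two steps of the pipeline summarized in the abstract, sampling frames at regular intervals and combining per-modality feature vectors, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sample_frame_indices` and `fuse_by_concatenation` are hypothetical, the feature vectors are random placeholders rather than bottleneck features from a pre-trained network, and plain concatenation of L2-normalized vectors is used as a simple stand-in for M-DCA, which the abstract names but does not specify. The SVM stage is omitted.

```python
import numpy as np


def sample_frame_indices(num_frames: int, num_samples: int) -> np.ndarray:
    """Return `num_samples` frame indices spread evenly over a video.

    Hypothetical helper: the abstract only says frames are fed
    "at regular intervals"; the exact sampling rule is an assumption.
    """
    if num_frames < 1 or num_samples < 1:
        raise ValueError("num_frames and num_samples must be positive")
    # Evenly spaced positions from the first to the last frame, truncated to ints.
    return np.linspace(0, num_frames - 1, num_samples).astype(int)


def fuse_by_concatenation(features: list) -> np.ndarray:
    """Fuse per-modality feature vectors by L2-normalizing and concatenating.

    Assumption: this is a simple stand-in for M-DCA, not M-DCA itself.
    """
    eps = 1e-12  # guards against division by zero for all-zero vectors
    normalized = []
    for feat in features:
        vec = np.asarray(feat, dtype=float)
        normalized.append(vec / max(np.linalg.norm(vec), eps))
    return np.concatenate(normalized)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder vectors for the RGB, depth and RGB-SklMHI streams.
    rgb_feat = rng.normal(size=4)
    depth_feat = rng.normal(size=5)
    skeleton_feat = rng.normal(size=6)

    indices = sample_frame_indices(num_frames=9, num_samples=3)
    fused = fuse_by_concatenation([rgb_feat, depth_feat, skeleton_feat])
    print("sampled frame indices:", indices)
    print("fused feature length:", fused.shape[0])
```

In a fuller version, the fused vector for each video would be passed to a linear multiclass classifier; normalizing each modality before fusion keeps one stream from dominating the others purely because of its scale.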

Pages: 33505-33525

Page count: 21