A Multimodal Fusion Approach for Human Activity Recognition

Cited by: 8
Authors
Koutrintzes, Dimitrios [1 ]
Spyrou, Evaggelos [2 ]
Mathe, Eirini [3 ]
Mylonas, Phivos [3 ]
Affiliations
[1] Natl Ctr Sci Res Demokritos, Inst Informat & Telecommun, Athens, Greece
[2] Univ Thessaly, Dept Informat & Telecommun, Lamia, Greece
[3] Ionian Univ, Dept Informat, Corfu, Greece
Keywords
Human activity recognition; multimodal fusion; deep convolutional neural networks
DOI
10.1142/S0129065723500028
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
The problem of human activity recognition (HAR) has been attracting increasing effort from the research community, as it has numerous applications. It consists of recognizing human motion and/or behavior within a given image or video sequence, using raw sensor measurements as input. In this paper, a multimodal approach to the task of video-based HAR is proposed. It is based on 3D visual data collected with an RGB+depth camera, which yields both raw video and 3D skeletal sequences. These data are transformed into six different 2D image representations: four lie in the spectral domain and a fifth is a pseudo-colored image, all five being derived from the skeletal data. The sixth representation is a "dynamic" image, i.e., an artificially created image that summarizes the RGB data of the whole video sequence in a visually comprehensible way. To classify a given activity video, all six 2D images are first extracted; then six trained convolutional neural networks are used to extract visual features. These features are fused into a single feature vector, which is fed into a support vector machine for classification into human activities. For evaluation, a challenging motion activity recognition dataset is used, and single-view, cross-view and cross-subject experiments are performed. Moreover, the proposed approach is compared to three other state-of-the-art methods, demonstrating superior performance in most experiments.
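As a rough, non-authoritative sketch of the late-fusion scheme outlined in the abstract (per-modality CNN feature extraction, concatenation into a single vector, and SVM classification), the following Python snippet may help visualize the pipeline. The ResNet-18 backbone, all function names, and the toy data are assumptions made here for illustration; they are not the authors' implementation.

```python
# Illustrative sketch of the fusion pipeline described in the abstract:
# six 2D image representations -> six CNN feature extractors ->
# concatenated feature vector -> SVM classifier.
# The ResNet-18 backbone and all names below are assumptions, not the
# authors' implementation.
import numpy as np
import torch
import torchvision.models as models
from sklearn.svm import SVC

NUM_MODALITIES = 6  # four spectral images, one pseudo-colored image, one dynamic image


def build_feature_extractor() -> torch.nn.Module:
    """Return a CNN with its classification head removed, used as a feature extractor."""
    backbone = models.resnet18(weights=None)  # pretrained weights would normally be loaded
    backbone.fc = torch.nn.Identity()         # output: 512-dim vector per image
    return backbone.eval()


@torch.no_grad()
def extract_fused_features(images, extractors):
    """Concatenate per-modality CNN features into one vector per sample.

    images[m] is a batch of the m-th 2D representation, shape (N, 3, 224, 224).
    """
    feats = [extractor(batch) for batch, extractor in zip(images, extractors)]
    return torch.cat(feats, dim=1).cpu().numpy()  # shape (N, 6 * 512)


if __name__ == "__main__":
    extractors = [build_feature_extractor() for _ in range(NUM_MODALITIES)]

    # Toy stand-ins for the six representations of 8 training clips, 4 activity classes.
    train_images = [torch.rand(8, 3, 224, 224) for _ in range(NUM_MODALITIES)]
    train_labels = np.random.randint(0, 4, size=8)

    clf = SVC(kernel="linear")
    clf.fit(extract_fused_features(train_images, extractors), train_labels)

    test_images = [torch.rand(2, 3, 224, 224) for _ in range(NUM_MODALITIES)]
    print(clf.predict(extract_fused_features(test_images, extractors)))
```

The design point this sketch mirrors is feature-level (late) fusion: each of the six 2D representations is encoded independently, and only the concatenated descriptors are seen by the single SVM classifier.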
Pages: 20
Related Papers
50 records in total
  • [31] Human-centric multimodal fusion network for robust action recognition
    Hu, Zesheng
    Xiao, Jian
    Li, Le
    Liu, Cun
    Ji, Genlin
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 239
  • [32] Marfusion: An Attention-Based Multimodal Fusion Model for Human Activity Recognition in Real-World Scenarios
    Zhao, Yunhan
    Guo, Siqi
    Chen, Zeqi
    Shen, Qiang
    Meng, Zhengyuan
    Xu, Hao
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (11):
  • [33] A Dual Pipeline With Spatio-Temporal Attention Fusion Approach for Human Activity Recognition
    Wang, Xiaodong
    Li, Ying
    Fang, Aiqing
    He, Pei
    Guo, Yangming
    [J]. IEEE SENSORS JOURNAL, 2024, 24 (15) : 25150 - 25162
  • [34] Multimodal fusion recognition for digital twin
    Zhou, Tianzhe
    Zhang, Xuguang
    Kang, Bing
    Chen, Mingkai
    [J]. DIGITAL COMMUNICATIONS AND NETWORKS, 2024, 10 (02) : 337 - 346
  • [35] Multimodal data fusion for object recognition
    Knyaz, Vladimir
    [J]. MULTIMODAL SENSING: TECHNOLOGIES AND APPLICATIONS, 2019, 11059
  • [37] Fusion Mappings for Multimodal Affect Recognition
    Kaechele, Markus
    Schels, Martin
    Thiam, Patrick
    Schwenker, Friedhelm
    [J]. 2015 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI), 2015, : 307 - 313
  • [38] Multimodal Emotion Recognition Using Feature Fusion: An LLM-Based Approach
    Chandraumakantham, Omkumar
    Gowtham, N.
    Zakariah, Mohammed
    Almazyad, Abdulaziz
    [J]. IEEE ACCESS, 2024, 12 : 108052 - 108071
  • [39] HYBRID FUSION BASED APPROACH FOR MULTIMODAL EMOTION RECOGNITION WITH INSUFFICIENT LABELED DATA
    Kumar, Puneet
    Khokher, Vedanti
    Gupta, Yukti
    Raman, Balasubramanian
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 314 - 318
  • [40] Speech emotion recognition using multimodal feature fusion with machine learning approach
    Panda, Sandeep Kumar
    Jena, Ajay Kumar
    Panda, Mohit Ranjan
    Panda, Susmita
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 : 42763 - 42781