Early vs Late Fusion in Multimodal Convolutional Neural Networks

Authors
Gadzicki, Konrad [1]
Khamsehashari, Razieh [1]
Zetzsche, Christoph [1]
Affiliations
[1] Univ Bremen, Cognit Neuroinformat, Bremen, Germany
Keywords
Multi-layer neural network; Activity recognition; Sensor fusion
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Combining machine learning in neural networks with multimodal fusion strategies offers interesting potential for classification tasks, but the optimal fusion strategies for many applications have yet to be determined. Here we address this issue in the context of human activity recognition, using a state-of-the-art convolutional network architecture (Inception I3D) and a large dataset (NTU RGB+D). As modalities we consider RGB video, optical flow, and skeleton data. We examine whether fusing different modalities provides an advantage over unimodal approaches, and whether a more complex early-fusion strategy can outperform the simpler late-fusion strategy by exploiting statistical correlations between the modalities. Our results show a clear performance improvement from multimodal fusion and a substantial advantage for the early-fusion strategy.
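The distinction between the two strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' I3D-based architecture: the function names, the linear classifier, the score-averaging rule, and all numbers below are hypothetical and chosen only to show where the modalities are combined. Late fusion merges the outputs of independent per-modality classifiers, whereas early fusion joins the modality features first so that a single classifier can exploit correlations between them.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(features: Sequence[float],
           weights: Sequence[Sequence[float]],
           bias: Sequence[float]) -> List[float]:
    """Toy linear classifier: one weight row and one bias value per class."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def late_fusion(per_modality_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average class probabilities from independent per-modality models.

    Averaging is one common late-fusion rule, assumed here for illustration.
    """
    n = len(per_modality_probs)
    return [sum(column) / n for column in zip(*per_modality_probs)]


def early_fusion(per_modality_features: Sequence[Sequence[float]],
                 weights: Sequence[Sequence[float]],
                 bias: Sequence[float]) -> List[float]:
    """Concatenate modality features, then apply a single joint classifier."""
    joint = [x for feats in per_modality_features for x in feats]
    return softmax(linear(joint, weights, bias))


# Hypothetical 2-dimensional features for the three modalities, 2 classes.
rgb_feats, flow_feats, skeleton_feats = [0.2, 0.9], [0.7, 0.1], [0.4, 0.4]

# Late fusion: each modality has its own (made-up) classifier.
own_weights = [[1.0, -1.0], [-1.0, 1.0]]
own_bias = [0.0, 0.0]
late = late_fusion([
    softmax(linear(f, own_weights, own_bias))
    for f in (rgb_feats, flow_feats, skeleton_feats)
])

# Early fusion: one classifier over the 6-dimensional concatenated vector.
joint_weights = [[1.0, -1.0, 0.5, 0.0, 0.3, -0.3],
                 [-1.0, 1.0, 0.0, 0.5, -0.3, 0.3]]
early = early_fusion([rgb_feats, flow_feats, skeleton_feats],
                     joint_weights, [0.0, 0.0])

print("late fusion :", late)
print("early fusion:", early)
```

In the late-fusion path no weight ever sees two modalities at once, so cross-modal correlations cannot influence the decision; in the early-fusion path every row of `joint_weights` spans all three modalities.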
Pages: 292-297
Number of pages: 6