Multi-modality Fusion Network for Action Recognition

Authors
Huang, Kai [1]
Qin, Zheng [1]
Xu, Kaiping [1]
Ye, Shuxiong [1]
Wang, Guolong [1]
Affiliations
[1] Tsinghua University, School of Software, Beijing, People's Republic of China
Keywords
Action recognition; 2D ConvNets; 3D ConvNets; Fusion; Histograms
DOI
10.1007/978-3-319-77383-4_14
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Deep neural networks have outperformed many traditional methods for action recognition on video datasets such as UCF101 and HMDB51. This paper explores how well convolutional networks of different dimensionality perform when fused. The main contribution of this work is the multi-modality fusion network (MMFN), a novel framework for action recognition that combines 2D ConvNets and 3D ConvNets. MMFN achieves higher accuracy than state-of-the-art deep-learning-based methods on UCF101 (94.6%) and HMDB51 (69.7%).
Pages: 139-149
Page count: 11
Related papers
50 in total
  • [1] Skeleton Sequence and RGB Frame Based Multi-Modality Feature Fusion Network for Action Recognition
    Zhu, Xiaoguang
    Zhu, Ye
    Wang, Haoyu
    Wen, Honglin
    Yan, Yan
    Liu, Peilin
    [J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2022, 18 (03)
  • [2] Multi-Modality Adaptive Feature Fusion Graph Convolutional Network for Skeleton-Based Action Recognition
    Zhang, Haiping
    Zhang, Xinhao
    Yu, Dongjin
    Guan, Liming
    Wang, Dongjing
    Zhou, Fuxing
    Zhang, Wanjun
    [J]. SENSORS, 2023, 23 (12)
  • [3] Multi-modality learning for human action recognition
    Ziliang Ren
    Qieshi Zhang
    Xiangyang Gao
    Pengyi Hao
    Jun Cheng
    [J]. Multimedia Tools and Applications, 2021, 80 : 16185 - 16203
  • [4] Multi-modality learning for human action recognition
    Ren, Ziliang
    Zhang, Qieshi
    Gao, Xiangyang
    Hao, Pengyi
    Cheng, Jun
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (11) : 16185 - 16203
  • [5] Human Action Recognition Via Multi-modality Information
    Gao, Zan
    Song, Jian-ming
    Zhang, Hua
    Liu, An-An
    Xue, Yan-Bing
    Xu, Guang-ping
    [J]. JOURNAL OF ELECTRICAL ENGINEERING & TECHNOLOGY, 2014, 9 (02) : 739 - 748
  • [6] Focal Channel Knowledge Distillation for Multi-Modality Action Recognition
    Gan, Lipeng
    Cao, Runze
    Li, Ning
    Yang, Man
    Li, Xiaochao
    [J]. IEEE ACCESS, 2023, 11 : 78285 - 78298
  • [7] Multi-modality Empowered Network for Facial Action Unit Detection
    Liu, Peng
    Zhang, Zheng
    Yang, Huiyuan
    Yin, Lijun
    [J]. 2019 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2019 : 2175 - 2184
  • [8] An Encoder Generative Adversarial Network for Multi-modality Image Recognition
    Chen, Yu
    Yang, Chunling
    Zhu, Min
    Yang, ShiYan
    [J]. IECON 2018 - 44TH ANNUAL CONFERENCE OF THE IEEE INDUSTRIAL ELECTRONICS SOCIETY, 2018 : 2689 - 2694
  • [9] Space-Time Block Code Recognition Algorithm Based on Multi-Modality Features Fusion Network
    Zhang, Yu-Yuan
    Yan, Wen-Jun
    Zhang, Li-Min
    [J]. Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2023, 51 (02) : 489 - 498