Multi-modal fusion method for human action recognition based on IALC

Cited by: 2

Authors:

Zhang, Yinhuan [1,2]

Xiao, Qinkun [1,3]

Liu, Xing [3]

Wei, Yongquan [4]

Chu, Chaoqin [1]

Xue, Jingyun [1]

Affiliations:

[1] Xian Technol Univ, Sch Mechatron Engn, Xian, Peoples R China

[2] Weinan Vocat & Tech Coll, Sch Construct Engn, Weinan, Peoples R China

[3] Xian Technol Univ, Sch Elect Informat Engn, Xian 710021, Peoples R China

[4] CRRC Tangshan Co Ltd, Tangshan, Peoples R China

Source:

IET IMAGE PROCESSING | 2023 / Vol. 17 / Issue 02

Keywords:

Fusion methods - Hidden-Markov models - Human behaviors - Human-action recognition - Multi-modal - Multi-modal fusion - Performance - Recognition accuracy - Sequence features - Video sequences;

DOI:

10.1049/ipr2.12640

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Human action recognition (HAR) accuracy is low in occlusion and interaction scenarios. To address this issue, this paper proposes a novel multi-modal fusion framework for HAR. The framework introduces a module called improved attention long short-term memory (IAL), which combines an improved SE-ResNet50 (ISE-ResNet50) with long short-term memory (LSTM). IAL extracts both the video sequence features and the skeleton sequence features of human behaviour. To improve HAR performance at a high semantic level, the resulting multi-modal sequence features are fed into a coupled hidden Markov model (CHMM), and a multi-modal IAL+CHMM method called IALC is developed based on a probabilistic graphical model. To test the proposed method, experiments are conducted on the HMDB51, UCF101, Kinetics 400k, and ActivityNet datasets, on which the recognition accuracies obtained are 86.40%, 97.78%, 81.12%, and 69.36%, respectively. The experimental results show that, in complex environments, the proposed IALC-based multi-modal fusion method for HAR can achieve more accurate recognition results.
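
As a rough illustration of the final stage only, the sketch below scores a pair of aligned sequences with a two-chain coupled HMM using the scaled forward algorithm, then picks the action class whose model gives the highest log-likelihood. It is not the authors' implementation. It assumes the video and skeleton features have already been quantized into discrete symbols, that one CHMM is trained per action class, and that each chain's next state depends on the previous states of both chains. The function names and the dictionary layout of the model are hypothetical.

```python
import math


def chmm_log_likelihood(obs_a, obs_b, model):
    """Log-likelihood of two aligned symbol sequences under a two-chain coupled HMM.

    Assumed (hypothetical) model layout:
      init_a[k], init_b[l]        initial state probabilities of each chain
      trans_a[i][j][k]            P(a_t = k | a_{t-1} = i, b_{t-1} = j)
      trans_b[i][j][l]            P(b_t = l | a_{t-1} = i, b_{t-1} = j)
      emit_a[k][o], emit_b[l][o]  per-chain emission probabilities
    """
    init_a, init_b = model["init_a"], model["init_b"]
    trans_a, trans_b = model["trans_a"], model["trans_b"]
    emit_a, emit_b = model["emit_a"], model["emit_b"]
    n_a, n_b = len(init_a), len(init_b)

    log_lik = 0.0
    alpha = None  # alpha[k][l]: scaled forward probability of joint state (k, l)
    for o_a, o_b in zip(obs_a, obs_b):
        new_alpha = []
        for k in range(n_a):
            row = []
            for l in range(n_b):
                if alpha is None:
                    # First time step: the chains start independently.
                    prior = init_a[k] * init_b[l]
                else:
                    # Each chain's transition is conditioned on both previous states.
                    prior = sum(
                        alpha[i][j] * trans_a[i][j][k] * trans_b[i][j][l]
                        for i in range(n_a)
                        for j in range(n_b)
                    )
                row.append(prior * emit_a[k][o_a] * emit_b[l][o_b])
            new_alpha.append(row)
        # Rescale at every step to avoid underflow; the log scales sum to the log-likelihood.
        scale = sum(sum(row) for row in new_alpha)
        log_lik += math.log(scale)
        alpha = [[v / scale for v in row] for row in new_alpha]
    return log_lik


def classify(models, obs_a, obs_b):
    """Return the class name whose CHMM assigns the highest log-likelihood."""
    return max(models, key=lambda name: chmm_log_likelihood(obs_a, obs_b, models[name]))
```

Here `obs_a` and `obs_b` stand for the quantized video and skeleton sequences. A sequence that has zero probability under a model would make `math.log` raise an error, so a practical version would need smoothing or an explicit guard.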


Pages: 388-400

Number of pages: 13

