Multimodal Fusion via Teacher-Student Network for Indoor Action Recognition

Cited by: 0

Authors:

Yu, Bruce X. B. [1]

Liu, Yan [1]

Chan, Keith C. C. [1]

Affiliations:

[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China

Source:

THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2021, Vol. 35

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Indoor action recognition plays an important role in modern society, for example in intelligent healthcare in large mobile cabin hospitals. With the wide use of depth sensors such as Kinect, multimodal information, including the skeleton and RGB modalities, offers a promising way to improve performance. However, existing methods either focus on a single data modality or fail to take advantage of multiple data modalities. In this paper, we propose a Teacher-Student Multimodal Fusion (TSMF) model that fuses the skeleton and RGB modalities at the model level for indoor action recognition. In our TSMF, we utilize a teacher network to transfer the structural knowledge of the skeleton modality to a student network for the RGB modality. Extensive experiments on two benchmark datasets, NTU RGB+D and PKU-MMD, show that the proposed TSMF consistently performs better than state-of-the-art single-modal and multimodal methods. The results also indicate that TSMF not only improves the accuracy of the student network but also significantly improves the ensemble accuracy.


Pages: 3199 - 3207

Page count: 9
