Multimodal multilevel attention for semi-supervised skeleton-based gesture recognition

Citations: 0

Authors:

Liu, Jinting [1]

Gan, Minggang [1]

He, Yuxuan [1]

Guo, Jia [1]

Hu, Kang [1]

Affiliations:

[1] Beijing Inst Technol, Sch Automat, State Key Lab Intelligent Control & Decis Complex, Beijing, Peoples R China

Source:

COMPLEX & INTELLIGENT SYSTEMS | 2025 / Vol. 11 / Issue 04

Keywords:

Gesture recognition; Skeleton; Self-attention; Semi-supervised; Deep learning; NEURAL-NETWORKS; FUSION;

DOI:

10.1007/s40747-025-01807-x

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Although supervised skeleton-based gesture recognition has achieved promising results, its reliance on extensive annotated data incurs significant cost. This paper addresses semi-supervised skeleton-based gesture recognition, where the goal is to learn feature representations effectively from both labeled and unlabeled data. To this end, we propose a novel multimodal multilevel attention network designed for semi-supervised learning. The model uses self-attention to aggregate complementary multimodal and multilevel semantic information from the hand skeleton, and introduces a multimodal multilevel contrastive loss to measure feature similarity. Specifically, the method explores the relationships among the joint, bone, and motion modalities to learn more discriminative feature representations. To reflect the hierarchy of the hand skeleton, the skeleton data is divided into multiple levels that capture complementary semantic information. The multimodal contrastive loss then measures similarity among these multilevel representations. Experiments on the SHREC-17 and DHG 14/28 datasets show improved performance on semi-supervised skeleton-based gesture recognition.
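The joint, bone, and motion modalities named in the abstract are commonly derived from raw joint coordinates in skeleton-based recognition. The sketch below is not the authors' implementation. It assumes a hypothetical 5-joint chain (`PARENTS`) and toy coordinates (`frames`), and it substitutes a generic InfoNCE-style loss with cosine similarity for the paper's multimodal multilevel contrastive loss, whose exact form the abstract does not give. All function names are illustrative.

```python
import math

# Hypothetical parent table for a toy 5-joint chain (wrist -> fingertip).
# Index i holds the parent of joint i; the root (joint 0) is its own parent.
PARENTS = [0, 0, 1, 2, 3]

# Toy data (assumption): 2 frames, 5 joints, 3-D coordinates per joint.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0), (2.0, 1.0, 0.0), (3.0, 1.5, 0.0), (4.0, 2.0, 0.0)],
]


def bone_stream(seq, parents=PARENTS):
    """Bone modality: each joint's position minus its parent's position."""
    return [
        [tuple(c - p for c, p in zip(frame[j], frame[parents[j]]))
         for j in range(len(frame))]
        for frame in seq
    ]


def motion_stream(seq):
    """Motion modality: per-joint displacement between consecutive frames.

    The first frame has no predecessor, so it is filled with zeros.
    """
    zeros = [tuple(0.0 for _ in joint) for joint in seq[0]]
    diffs = [
        [tuple(b - a for a, b in zip(prev[j], cur[j])) for j in range(len(cur))]
        for prev, cur in zip(seq, seq[1:])
    ]
    return [zeros] + diffs


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss (a stand-in, not the paper's loss).

    The loss is small when the anchor is more similar to the positive
    than to any of the negatives.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]
```

In a setup like this, feature vectors produced by an encoder for two modalities of the same gesture would serve as the anchor and positive, while features from other gestures would serve as negatives. That pairing is likewise an assumption about how such a loss is typically applied.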


Pages: 16

50 items in total

[1] Quantized depth image and skeleton-based multimodal dynamic hand gesture recognition
Mahmud, Hasan
Morshed, Mashrur M.
Hasan, Md. Kamrul
VISUAL COMPUTER, 2024, 40 (01): 11-25
[2] X-Invariant Contrastive Augmentation and Representation Learning for Semi-Supervised Skeleton-Based Action Recognition
Xu, Binqian
Shu, Xiangbo
Song, Yan
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31: 3852-3867
[3] Quantized depth image and skeleton-based multimodal dynamic hand gesture recognition
Hasan Mahmud
Mashrur M. Morshed
Md. Kamrul Hasan
The Visual Computer, 2024, 40: 11-25
[4] A Hand Gesture Recognition Model Based on Semi-supervised Learning
Tao, Meiping
Ma, Li
2015 7TH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS IHMSC 2015, VOL II, 2015
[5] Semi-Supervised Skeleton-Based Covert Cheating Detection in Electronic-Exams
Atabay, Habibollah Agh
Hassanpour, Hamid
IRANIAN JOURNAL OF SCIENCE AND TECHNOLOGY-TRANSACTIONS OF ELECTRICAL ENGINEERING, 2024, 48 (04): 1539-1551
[6] Skeleton-based Dynamic hand gesture recognition
De Smedt, Quentin
Wannous, Hazem
Vandeborre, Jean-Philippe
PROCEEDINGS OF 29TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2016), 2016: 1206-1214
[7] Multi-Granularity Anchor-Contrastive Representation Learning for Semi-Supervised Skeleton-Based Action Recognition
Shu, Xiangbo
Xu, Binqian
Zhang, Liyan
Tang, Jinhui
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (06): 7559-7576
[8] Semi-Supervised Learning for Surface EMG-based Gesture Recognition
Du, Yu
Wong, Yongkang
Jin, Wenguang
Wei, Wentao
Hu, Yu
Kankanhalli, Mohan
Geng, Weidong
PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017: 1624-1630
[9] HAN: An efficient hierarchical self-attention network for skeleton-based gesture recognition
Liu, Jianbo
Wang, Ying
Xiang, Shiming
Pan, Chunhong
PATTERN RECOGNITION, 2025, 162
[10] A semi-supervised human action recognition algorithm based on skeleton feature
Yuan, Hejin
Journal of Information Hiding and Multimedia Signal Processing, 2015, 6 (01): 175-182
