Multimodal multilevel attention for semi-supervised skeleton-based gesture recognition

Cited by: 0

Authors:

Liu, Jinting [1]

Gan, Minggang [1]

He, Yuxuan [1]

Guo, Jia [1]

Hu, Kang [1]

Affiliations:

[1] Beijing Inst Technol, Sch Automat, State Key Lab Intelligent Control & Decis Complex, Beijing, Peoples R China

Source:

COMPLEX & INTELLIGENT SYSTEMS | 2025 / Vol. 11 / Issue 04

Keywords:

Gesture recognition; Skeleton; Self-attention; Semi-supervised; Deep learning; NEURAL-NETWORKS; FUSION;

DOI:

10.1007/s40747-025-01807-x

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Although skeleton-based gesture recognition using supervised learning has achieved promising results, its reliance on extensive annotated data is costly. This paper addresses semi-supervised skeleton-based gesture recognition, which aims to learn feature representations effectively from both labeled and unlabeled data. To this end, we propose a novel multimodal multilevel attention network designed for semi-supervised learning. The model uses the self-attention mechanism to aggregate multimodal and multilevel complementary semantic information of the hand skeleton, and a multimodal multilevel contrastive loss to measure feature similarity. Specifically, our method explores the relationships between the joint, bone, and motion modalities to learn more discriminative feature representations. Considering the hierarchy of the hand skeleton, the skeleton data is divided into multiple levels to capture complementary semantic information. Furthermore, the multimodal contrastive loss measures similarity among these multilevel representations. Experiments on the SHREC-17 and DHG 14/28 datasets show that the proposed method improves performance on semi-supervised skeleton-based gesture recognition tasks.


Pages: 16
