Multimodal multilevel attention for semi-supervised skeleton-based gesture recognition

Citations: 0
Authors
Liu, Jinting [1]
Gan, Minggang [1]
He, Yuxuan [1]
Guo, Jia [1]
Hu, Kang [1]
Affiliations
[1] Beijing Inst Technol, Sch Automat, State Key Lab Intelligent Control & Decis Complex, Beijing, Peoples R China
Keywords
Gesture recognition; Skeleton; Self-attention; Semi-supervised; Deep learning; NEURAL-NETWORKS; FUSION
DOI
10.1007/s40747-025-01807-x
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Although supervised skeleton-based gesture recognition has achieved promising results, its reliance on extensive annotated data incurs significant cost. This paper addresses semi-supervised skeleton-based gesture recognition, aiming to learn effective feature representations from both labeled and unlabeled data. To this end, we propose a novel multimodal multilevel attention network designed for semi-supervised learning. The model uses the self-attention mechanism to aggregate multimodal and multilevel complementary semantic information from the hand skeleton, and introduces a multimodal multilevel contrastive loss to measure feature similarity. Specifically, our method explores the relationships between the joint, bone, and motion modalities to learn more discriminative feature representations. Considering the hierarchy of the hand skeleton, the skeleton data is divided into multiple levels to capture complementary semantic information. Furthermore, the multimodal contrastive loss measures similarity among these multilevel representations. The proposed method demonstrates improved performance in semi-supervised skeleton-based gesture recognition tasks, as evidenced by experiments on the SHREC-17 and DHG 14/28 datasets.
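The abstract does not define how the joint, bone, and motion modalities are computed. The sketch below uses a convention that is common in skeleton-based recognition and is an assumption here, not the authors' stated method: a bone is the offset of a joint from its parent joint, and motion is the frame-to-frame displacement of each joint. The function name `derive_modalities`, the `parents` list, and the toy three-joint chain are hypothetical and do not reflect the SHREC-17 or DHG 14/28 skeleton layouts.

```python
from typing import Dict, List, Sequence

Frame = List[List[float]]  # one frame: J joints, each with C coordinates


def derive_modalities(
    joints: Sequence[Frame], parents: Sequence[int]
) -> Dict[str, List[Frame]]:
    """Derive bone and motion streams from a joint sequence of shape (T, J, C).

    Assumed convention (not taken from the paper):
      bone[t][j]   = joints[t][j] - joints[t][parents[j]]
      motion[t][j] = joints[t][j] - joints[t-1][j], with motion[0] = 0
    The root joint is listed as its own parent, so its bone vector is zero.
    """
    joint_stream = [[list(j) for j in frame] for frame in joints]

    # Spatial difference between each joint and its parent within a frame.
    bone_stream = [
        [
            [c - p for c, p in zip(frame[j], frame[parents[j]])]
            for j in range(len(frame))
        ]
        for frame in joints
    ]

    # Temporal difference between consecutive frames; first frame is zero.
    motion_stream: List[Frame] = []
    if joints:
        motion_stream.append([[0.0] * len(j) for j in joints[0]])
    for prev, cur in zip(joints, joints[1:]):
        motion_stream.append(
            [[c - p for c, p in zip(cj, pj)] for cj, pj in zip(cur, prev)]
        )

    return {"joint": joint_stream, "bone": bone_stream, "motion": motion_stream}


if __name__ == "__main__":
    # Toy chain: joint 0 is the root, joint 1 hangs off 0, joint 2 off 1.
    toy = [
        [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 2.0, 0.0]],
        [[0.5, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 2.0, 1.0]],
    ]
    streams = derive_modalities(toy, parents=[0, 0, 1])
    print(streams["bone"][1])
    print(streams["motion"][1])
```

Under this assumption, all three streams keep the same (T, J, C) shape, so each could be fed to its own encoder branch before any attention-based fusion.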
Pages: 16
Related papers
50 in total
  • [31] SPD Siamese Neural Network for Skeleton-based Hand Gesture Recognition
    Akremi, Mohamed Sanim
    Slama, Rim
    Tabia, Hedi
    PROCEEDINGS OF THE 17TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 4, 2022, : 394 - 402
  • [32] mmGesture: Semi-supervised gesture recognition system using mmWave radar
    Yan, Baiju
    Wang, Peng
    Du, Lidong
    Chen, Xianxiang
    Fang, Zhen
    Wu, Yirong
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 213
  • [33] Gesture recognition based on multilevel multimodal feature fusion
    Tian, Jinrong
    Cheng, Wentao
    Sun, Ying
    Li, Gongfa
    Jiang, Du
    Jiang, Guozhang
    Tao, Bo
    Zhao, Haoyi
    Chen, Disi
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 38 (03) : 2539 - 2550
  • [34] Temporal Decoupling Graph Convolutional Network for Skeleton-Based Gesture Recognition
    Liu, Jinfu
    Wang, Xinshun
    Wang, Can
    Gao, Yuan
    Liu, Mengyuan
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 811 - 823
  • [35] Decoupled and boosted learning for skeleton-based dynamic hand gesture recognition
    Li, Yangke
    Wei, Guangshun
    Desrosiers, Christian
    Zhou, Yuanfeng
    PATTERN RECOGNITION, 2024, 153
  • [36] Fusing Skeleton-Based Scene Flow for Gesture Recognition on Point Clouds
    Liu, Yahui
    Jiao, Jiajia
    ELECTRONICS, 2025, 14 (03):
  • [37] Skeleton-Based Action and Gesture Recognition for Human-Robot Collaboration
    Terreran, Matteo
    Lazzaretto, Margherita
    Ghidoni, Stefano
    INTELLIGENT AUTONOMOUS SYSTEMS 17, IAS-17, 2023, 577 : 29 - 45
  • [38] DIFFERENTIAL PSEUDO-IMAGE FOR SKELETON-BASED DYNAMIC GESTURE RECOGNITION
    Kapuscinski, Tomasz
    Mis, Mateusz
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 4203 - 4207
  • [39] Compact joints encoding for skeleton-based dynamic hand gesture recognition
    Li, Yangke
    Ma, Dongyang
    Yu, Yuhang
    Wei, Guangshun
    Zhou, Yuanfeng
    COMPUTERS & GRAPHICS-UK, 2021, 97 : 191 - 199
  • [40] Semi-Supervised SAR Target Recognition with Graph Attention Network
    Wen, Liwu
    Huang, Xuejun
    Qin, Siqi
    Ding, Jinshan
    13TH EUROPEAN CONFERENCE ON SYNTHETIC APERTURE RADAR, EUSAR 2021, 2021, : 378 - 382