Weighted Multi-modal Sign Language Recognition

Cited by: 0
Authors
Liu, Edmond [1 ]
Lim, Jong Yoon [1 ]
MacDonald, Bruce [1 ]
Ahn, Ho Seok [1 ]
Affiliations
[1] Univ Auckland, Dept Elect Comp & Software Engn, Fac Engn, Auckland, New Zealand
Keywords
DOI
10.1109/RO-MAN60168.2024.10731214
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multiple modalities can boost accuracy in the difficult task of Sign Language Recognition (SLR); however, each modality does not necessarily contribute the same quality of information. Current multi-modal approaches assign the same importance weighting to every modality, or set weightings based on unproven heuristics. This paper takes a systematic approach to finding the optimal weights by performing a grid search. First, we create a multi-modal version of the RGB-only WLASL100 dataset with additional hand-crop and skeletal-pose modalities. Second, we create a 3D-CNN-based weighted multi-modal sign language network (WMSLRnet). Finally, we run grid searches to find the optimal weighting for each modality. We show that very minor adjustments to the weightings can have major effects on final SLR accuracy. On WLASL100, we significantly outperform previous networks of similar design, and achieve high SLR accuracy without highly complex pre-training schemes or extra data.
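The weighted fusion and grid search described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: it assumes late fusion of per-modality class logits (RGB, hand crop, skeletal pose) with weights constrained to sum to one, and the function names and grid step are hypothetical.

```python
import itertools
import numpy as np

def fuse(logits_by_modality, weights):
    """Weighted late fusion: sum of per-modality class logits."""
    return sum(w * l for w, l in zip(weights, logits_by_modality))

def grid_search_weights(logits_by_modality, labels, step=0.1):
    """Exhaustively search weight triples (w_rgb, w_crop, w_pose)
    that sum to 1, returning the triple with the best accuracy."""
    best_acc, best_w = -1.0, None
    grid = np.arange(0.0, 1.0 + 1e-9, step)
    for w_rgb, w_crop in itertools.product(grid, grid):
        w_pose = 1.0 - w_rgb - w_crop
        if w_pose < -1e-9:  # outside the simplex
            continue
        fused = fuse(logits_by_modality, (w_rgb, w_crop, w_pose))
        acc = float((fused.argmax(axis=1) == labels).mean())
        if acc > best_acc:
            best_acc, best_w = acc, (w_rgb, w_crop, w_pose)
    return best_w, best_acc
```

With a step of 0.1 the search evaluates only a few dozen weight triples on held-out logits, which is consistent with the paper's observation that small weight changes can shift final accuracy noticeably.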
Pages: 880 - 885 (6 pages)
Related Papers
50 items total
  • [31] Emotion Recognition from Multi-Modal Information
    Wu, Chung-Hsien
    Lin, Jen-Chun
    Wei, Wen-Li
    Cheng, Kuan-Chun
    2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013,
  • [32] Multi-modal broad learning for material recognition
    Wang, Zhaoxin
    Liu, Huaping
    Xu, Xinying
    Sun, Fuchun
    COGNITIVE COMPUTATION AND SYSTEMS, 2021, 3 (02) : 123 - 130
  • [33] Multi-modal Laughter Recognition in Video Conversations
    Escalera, Sergio
    Puertas, Eloi
    Radeva, Petia
    Pujol, Oriol
    2009 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPR WORKSHOPS 2009), VOLS 1 AND 2, 2009, : 869 - 874
  • [34] Multi-modal person recognition for vehicular applications
    Erdogan, H
    Erçil, A
    Ekenel, HK
    Bilgin, SY
    Eden, I
    Kirisçi, M
    Abut, H
    MULTIPLE CLASSIFIER SYSTEMS, 2005, 3541 : 366 - 375
  • [35] Fusing Multi-modal Features for Gesture Recognition
    Wu, Jiaxiang
    Cheng, Jian
    Zhao, Chaoyang
    Lu, Hanqing
    ICMI'13: PROCEEDINGS OF THE 2013 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2013, : 453 - 459
  • [36] Multi-modal deep learning for landform recognition
    Du, Lin
    You, Xiong
    Li, Ke
    Meng, Liqiu
    Cheng, Gong
    Xiong, Liyang
    Wang, Guangxia
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2019, 158 : 63 - 75
  • [37] Multi-modal Emotion Recognition Based on Hypergraph
    Zong L.-L.
    Zhou J.-H.
    Xie Q.-J.
    Zhang X.-C.
    Xu B.
    Jisuanji Xuebao/Chinese Journal of Computers, 2023, 46 (12): : 2520 - 2534
  • [38] Multi-modal Sensing for Human Activity Recognition
    Bruno, Barbara
    Grosinger, Jasmin
    Mastrogiovanni, Fulvio
    Pecora, Federico
    Saffiotti, Alessandro
    Sathyakeerthy, Subhash
    Sgorbissa, Antonio
    2015 24TH IEEE INTERNATIONAL SYMPOSIUM ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION (RO-MAN), 2015, : 594 - 600
  • [39] Comparison on Multi-modal Biometric Recognition Method
    Amirthalingam, Gandhimathi
    Govindaraju, Radhamani
    COMPUTATIONAL INTELLIGENCE AND INFORMATION TECHNOLOGY, 2011, 250 : 524 - +
  • [40] Evaluation and Discussion of Multi-modal Emotion Recognition
    Rabie, Ahmad
    Wrede, Britta
    Vogt, Thurid
    Hanheide, Marc
    SECOND INTERNATIONAL CONFERENCE ON COMPUTER AND ELECTRICAL ENGINEERING, VOL 1, PROCEEDINGS, 2009, : 598 - +