Weighted Multi-modal Sign Language Recognition

Cited by: 0

Authors:

Liu, Edmond [1]

Lim, Jong Yoon [1]

MacDonald, Bruce [1]

Ahn, Ho Seok [1]

Affiliations:

[1] Univ Auckland, Dept Elect Comp & Software Engn, Fac Engn, Auckland, New Zealand

Source:

2024 33RD IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, ROMAN 2024 | 2024

Keywords:

DOI:

10.1109/RO-MAN60168.2024.10731214

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Multiple modalities can boost accuracy in the difficult task of Sign Language Recognition (SLR); however, each modality does not necessarily contribute the same quality of information. Current multi-modal approaches assign the same importance weightings to each modality, or set weightings based on unproven heuristics. This paper takes a systematic approach to finding the optimal weights by performing grid search. Firstly, we create a multi-modal version of the RGB-only WLASL100 data with additional hand crop and skeletal pose modalities. Secondly, we create a 3D-CNN-based weighted multi-modal sign language network (WMSLRnet). Finally, we run various grid searches to find the optimal weightings for each modality. We show that very minor adjustments in the weightings can have major effects on the final SLR accuracy. On WLASL100, we significantly outperform previous networks of similar design, and achieve high accuracy in SLR without highly complex pre-training schemes or extra data.
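
The abstract does not say how the modality outputs are combined or which grid is searched. As a minimal sketch only, assuming late fusion by a weighted sum of per-class scores and a coarse, hypothetical grid of candidate weights, a grid search over modality weightings might look like the following. All names (`fuse`, `accuracy`, `grid_search`, `MODALITIES`) and the toy scores are illustrative assumptions, not taken from the paper or from WMSLRnet.

```python
from itertools import product

# Assumed modality names, mirroring the three modalities named in the abstract.
MODALITIES = ["rgb", "hand_crop", "pose"]


def fuse(scores_by_modality, weights):
    """Weighted sum of per-class scores across modalities for one sample."""
    num_classes = len(next(iter(scores_by_modality.values())))
    fused = [0.0] * num_classes
    for name, scores in scores_by_modality.items():
        for cls, score in enumerate(scores):
            fused[cls] += weights[name] * score
    return fused


def accuracy(samples, labels, weights):
    """Fraction of samples whose fused arg-max class equals the label."""
    correct = 0
    for scores_by_modality, label in zip(samples, labels):
        fused = fuse(scores_by_modality, weights)
        predicted = max(range(len(fused)), key=fused.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)


def grid_search(samples, labels, modalities, candidates):
    """Try every combination of candidate weights and keep the most accurate."""
    best_weights, best_acc = None, -1.0
    for combo in product(candidates, repeat=len(modalities)):
        weights = dict(zip(modalities, combo))
        acc = accuracy(samples, labels, weights)
        if acc > best_acc:
            best_weights, best_acc = weights, acc
    return best_weights, best_acc


# Made-up per-class scores for three samples and two classes (not real data).
samples = [
    {"rgb": [0.6, 0.4], "hand_crop": [0.3, 0.7], "pose": [0.55, 0.45]},
    {"rgb": [0.3, 0.7], "hand_crop": [0.6, 0.4], "pose": [0.4, 0.6]},
    {"rgb": [0.7, 0.3], "hand_crop": [0.4, 0.6], "pose": [0.45, 0.55]},
]
labels = [0, 1, 0]

equal_acc = accuracy(samples, labels, {name: 1.0 for name in MODALITIES})
best_weights, best_acc = grid_search(samples, labels, MODALITIES, [0.0, 0.5, 1.0])
print("equal weights accuracy:", equal_acc)
print("best weights:", best_weights, "accuracy:", best_acc)
```

On this toy data, equal weighting misclassifies one sample while some weight combinations on the grid classify all three correctly, which illustrates the abstract's point that the choice of weightings can change the final accuracy. In practice the weights would be selected on a validation split rather than on the data used to report results.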


Pages: 880-885

Page count: 6


[1] Skeleton aware multi-modal sign language recognition
Jiang, Songyao
Sun, Bin
Wang, Lichen
Bai, Yue
Li, Kunpeng
Fu, Yun
IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2021, : 3408 - 3418
[2] Skeleton aware multi-modal sign language recognition
Jiang, Songyao
Sun, Bin
Wang, Lichen
Bai, Yue
Li, Kunpeng
Fu, Yun
arXiv, 2021,
[3] Skeleton Aware Multi-modal Sign Language Recognition
Jiang, Songyao
Sun, Bin
Wang, Lichen
Bai, Yue
Li, Kunpeng
Fu, Yun
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 3408 - 3418
[4] Multi-modal Sign Language Recognition with Enhanced Spatiotemporal Representation
Xiao, Shiwei
Fang, Yuchun
Ni, Lan
2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
[5] MLMSign: Multi-lingual multi-modal illumination-invariant sign language recognition
Sadeghzadeh, Arezoo
Shah, A. F. M. Shahen
Islam, Md Baharul
INTELLIGENT SYSTEMS WITH APPLICATIONS, 2024, 22
[6] Multi-Modal Fusion Sign Language Recognition Based on Residual Network and Attention Mechanism
Chu Chaoqin
Xiao Qinkun
Zhang Yinhuan
Xing, Liu
INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2022, 36 (12)
[7] On the use of Multi-Modal Sensing in Sign Language Classification
Sharma, Sneha
Gupta, Rinki
Kumar, Arun
2019 6TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND INTEGRATED NETWORKS (SPIN), 2019, : 495 - 500
[8] Multi-modal Dialogue System with Sign Language Capabilities
Hruz, M.
Campr, P.
Krnoul, Z.
Zelezny, M.
Aran, Oya
Santemiz, Pinar
ASSETS 11: PROCEEDINGS OF THE 13TH INTERNATIONAL ACM SIGACCESS CONFERENCE ON COMPUTERS AND ACCESSIBILITY, 2011, : 265 - 266
[9] Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine
Rastgoo, Razieh
Kiani, Kourosh
Escalera, Sergio
ENTROPY, 2018, 20 (11)
[10] Traffic sign recognition algorithm design based on multi-modal representation
Cai, Z.-X. (zxcai@csu.edu.cn), 1600, Northeast University (28):
