Character-level Arabic text generation from sign language video using encoder-decoder model

Cited by: 2

Authors:

Boukdir, Abdelbasset ^{[1]}

Benaddy, Mohamed ^{[1]}

El Meslouhi, Othmane ^{[2]}

Kardouchi, Mustapha ^{[3]}

Akhloufi, Moulay ^{[3]}

Affiliations:

[1] Ibn Zohr Univ, FSA PFO, LabSI Lab, Ouarzazate, Morocco

[2] Cadi Ayyad Univ, Natl Sch Appl Sci Safi, SARS Grp, Safi, Morocco

[3] Univ Moncton, Dept Comp Sci, PRIME Grp, Moncton, NB, Canada

Source:

DISPLAYS | 2023 / Vol. 76

Keywords:

Arabic text; Pose estimation; Video caption; Deep learning; Gated Recurrent Unit; Neural network

DOI:

10.1016/j.displa.2022.102340

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Video-to-text conversion is a vital task in computer vision. In recent years, deep learning algorithms have dominated automatic text generation in English, but few research works are available for other languages. In this paper, we propose a novel encoder-decoder system that generates character-level Arabic sentences from isolated RGB videos of Moroccan Sign Language (MoSL). The video sequence is encoded by spatiotemporal feature extraction using pose estimation models, while the label text of the video is converted into a sequence of representative vectors. The features and the label vectors are then joined and processed by a decoder layer to derive the final prediction. We trained the proposed system on an isolated Moroccan Sign Language dataset (MoSLD), composed of RGB videos of 125 MoSL signs. The experimental results show that the proposed model attains the best performance under several evaluation metrics.
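
The abstract states that the label text is turned into a sequence of representative vectors at character level, but it does not say how. The following is a minimal sketch of one common way to do this (one-hot character vectors with start and end markers, and a one-step shift between decoder input and target). The example labels, the sentinel characters, and all function names are assumptions made for illustration; they are not taken from the paper or from MoSLD.

```python
# Hypothetical example labels; the actual MoSLD vocabulary is not given in the abstract.
LABELS = ["شكرا", "مرحبا"]

# Assumed sentinel characters marking the start and end of a sequence.
START, END = "\t", "\n"


def build_vocab(labels):
    """Map every character (plus the two sentinels) to an integer index."""
    chars = sorted(set("".join(labels)) | {START, END})
    return {ch: i for i, ch in enumerate(chars)}


def one_hot(index, size):
    """Return a one-hot list of length `size` with 1.0 at position `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def encode_label(label, vocab):
    """Turn a label into (decoder_input, decoder_target) one-hot sequences.

    The target is the input shifted by one step, as used with teacher forcing.
    """
    seq = START + label + END
    vectors = [one_hot(vocab[ch], len(vocab)) for ch in seq]
    return vectors[:-1], vectors[1:]


def decode_vectors(vectors, vocab):
    """Invert the encoding by taking the arg-max of each vector."""
    inverse = {i: ch for ch, i in vocab.items()}
    chars = [inverse[max(range(len(v)), key=v.__getitem__)] for v in vectors]
    return "".join(chars).replace(START, "").replace(END, "")


vocab = build_vocab(LABELS)
dec_in, dec_target = encode_label(LABELS[0], vocab)
```

In a setup like the one the abstract outlines, sequences such as `dec_in` would be combined with the pose-based video features before the decoder, and `dec_target` would serve as the training target; how the paper performs that combination is not specified here.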

Pages: 9
