Character-level Arabic text generation from sign language video using encoder-decoder model

Cited by: 2

Authors:

Boukdir, Abdelbasset [1]

Benaddy, Mohamed [1]

El Meslouhi, Othmane [2]

Kardouchi, Mustapha [3]

Akhloufi, Moulay [3]

Affiliations:

[1] Ibn Zohr Univ, FSA PFO, LabSI Lab, Ouarzazate, Morocco

[2] Cadi Ayyad Univ, Natl Sch Appl Sci Safi, SARS Grp, Safi, Morocco

[3] Univ Moncton, Dept Comp Sci, PRIME Grp, Moncton, NB, Canada

Source:

DISPLAYS | 2023 / Vol. 76

Keywords:

Arabic text; Pose estimation; Video caption; Deep learning; Gated Recurrent Unit; Neural network

DOI:

10.1016/j.displa.2022.102340

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Video-to-text conversion is an important task in computer vision. In recent years, deep learning algorithms have dominated automatic text generation in English, but few research works are available for other languages. In this paper, we propose a novel encoder-decoder system that generates character-level Arabic sentences from isolated RGB videos of Moroccan Sign Language. The video sequence is encoded by spatiotemporal feature extraction using pose estimation models, while the label text of the video is converted into a sequence of representative vectors. The features and the label vectors are then joined and processed by a decoder layer to derive the final prediction. We trained the proposed system on an isolated Moroccan Sign Language dataset (MoSLD) composed of RGB videos of 125 MoSL signs. The experimental results show that the proposed model attains the best performance under several evaluation metrics.
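
The abstract states that each label text is turned into a sequence of representative vectors at the character level, but it does not say how. The following is a minimal sketch of one common way to do this, one-hot encoding per character. It is an assumption for illustration only, not the authors' implementation: the function names, the boundary tokens `<s>` and `</s>`, and the two example labels are hypothetical and are not taken from MoSLD.

```python
# Hypothetical sketch: the one-hot scheme, token names, and labels are
# assumptions for illustration, not the method described in the paper.
START, END = "<s>", "</s>"  # assumed sentence-boundary tokens


def build_vocab(labels):
    """Map every character seen in the labels (plus boundary tokens) to an index."""
    chars = sorted({ch for text in labels for ch in text})
    return {tok: i for i, tok in enumerate([START, END] + chars)}


def encode_label(text, vocab):
    """Turn one label into a list of one-hot vectors, one per character."""
    tokens = [START] + list(text) + [END]
    vectors = []
    for tok in tokens:
        vec = [0] * len(vocab)
        vec[vocab[tok]] = 1  # exactly one active position per character
        vectors.append(vec)
    return vectors


labels = ["شكرا", "سلام"]  # illustrative Arabic labels only
vocab = build_vocab(labels)
vectors = encode_label(labels[0], vocab)
```

In a character-level decoder of this kind, such vectors would typically serve as the per-step inputs and targets, so the output vocabulary stays small (the alphabet plus boundary tokens) regardless of how many signs the dataset contains.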

Pages: 9

