Character-level Arabic text generation from sign language video using encoder-decoder model

Cited by: 2
Authors
Boukdir, Abdelbasset [1]
Benaddy, Mohamed [1]
El Meslouhi, Othmane [2]
Kardouchi, Mustapha [3]
Akhloufi, Moulay [3]
Affiliations
[1] Ibn Zohr Univ, FSA PFO, LabSI Lab, Ouarzazate, Morocco
[2] Cadi Ayyad Univ, Natl Sch Appl Sci Safi, SARS Grp, Safi, Morocco
[3] Univ Moncton, Dept Comp Sci, PRIME Grp, Moncton, NB, Canada
Keywords
Arabic text; Pose estimation; Video caption; Deep learning; Gated Recurrent Unit; Neural network
DOI
10.1016/j.displa.2022.102340
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Video-to-text conversion is a vital task in computer vision. In recent years, deep learning algorithms have dominated automatic text generation in English, but few research works are available for other languages. In this paper, we propose a novel encoder-decoder system that generates character-level Arabic sentences from isolated RGB videos of Moroccan Sign Language. The video sequence is encoded by spatiotemporal feature extraction using pose estimation models, while the label text of the video is converted into a sequence of representative vectors. The features and the label vectors are then combined and processed by a decoder layer to produce the final prediction. We trained the proposed system on an isolated Moroccan Sign Language dataset (MoSLD), composed of RGB videos of 125 MoSL signs. The experimental results show that the proposed model attains the best performance under several evaluation metrics.
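As a rough illustration of the character-level label representation the abstract mentions, the sketch below one-hot encodes an Arabic label string, one vector per character. It is an assumption-laden example and not the authors' implementation: the helper names `build_vocab` and `encode_label`, the `START`/`END` marker characters, the sample labels, and the choice of one-hot vectors are all hypothetical.

```python
# Hypothetical sketch: character-level one-hot encoding of an Arabic label.
# START, END and both helper names are assumptions, not taken from the paper.
START, END = "\t", "\n"


def build_vocab(labels):
    """Map every character seen in the labels (plus START/END) to an index."""
    chars = sorted(set("".join(labels)) | {START, END})
    return {ch: i for i, ch in enumerate(chars)}


def encode_label(label, vocab):
    """Return one one-hot vector per character of START + label + END."""
    vectors = []
    for ch in START + label + END:
        row = [0] * len(vocab)
        row[vocab[ch]] = 1
        vectors.append(row)
    return vectors


# Two sample labels chosen only for illustration.
labels = ["شكرا", "مرحبا"]
vocab = build_vocab(labels)
encoded = encode_label(labels[0], vocab)
```

In a decoder of the kind described, vectors like these would typically be fed in one character at a time alongside the video features, with the model predicting the next character at each step.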
Pages: 9
Related papers
33 in total
  • [1] DLCEncDec: A Fully Character-level Encoder-Decoder Model for Neural Responding Conversation
    Wu, Sixing
    Li, Ying
    Zhang, Xinyuan
    Wu, Zhonghai
    2018 IEEE 42ND ANNUAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE (COMPSAC), VOL 1, 2018: 516-521
  • [2] Correlation Encoder-Decoder Model for Text Generation
    Zhang, Xu
    Li, Yifeng
    Peng, Xueping
    Qiao, Xinxiao
    Zhang, Hui
    Lu, Wenpeng
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [3] Video to Text Study using an Encoder-Decoder Networks Approach
    Ismael Orozco, Carlos
    Elena Buemi, Maria
    Jacobo Berlles, Julio
    2018 37TH INTERNATIONAL CONFERENCE OF THE CHILEAN COMPUTER SCIENCE SOCIETY (SCCC), 2018
  • [4] Encoder-decoder recurrent network model for interactive character animation generation
    Wang, Yumeng
    Che, Wujun
    Xu, Bo
    VISUAL COMPUTER, 2017, 33 (6-8): 971-980
  • [5] Using Character-Level Sequence-to-Sequence Model for Word Level Text Generation to Enhance Arabic Speech Recognition
    Azim, Mona A.
    Hussein, Wedad
    Badr, Nagwa L.
    IEEE ACCESS, 2023, 11: 91173-91183
  • [6] Tweet2Vec: Learning Tweet Embeddings Using Character-level CNN-LSTM Encoder-Decoder
    Vosoughi, Soroush
    Vijayaraghavan, Prashanth
    Roy, Deb
    SIGIR'16: PROCEEDINGS OF THE 39TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2016: 1041-1044
  • [7] Encoder-Decoder Couplet Generation Model Based on 'Trapezoidal Context' Character Vector
    Gao, Rui
    Zhu, Yuanyuan
    Li, Mingye
    Li, Shoufeng
    Shi, Xiaohu
    COMPUTER JOURNAL, 2021, 64 (03): 286-295
  • [8] Encoder-Decoder Model for Automatic Video Captioning Using Yolo Algorithm
    Alkalouti, Hanan Nasser
    Al Masre, Mayada Ahmed
    2021 IEEE INTERNATIONAL IOT, ELECTRONICS AND MECHATRONICS CONFERENCE (IEMTRONICS), 2021: 718-721
  • [9] Arabic Machine Transliteration using an Attention-based Encoder-decoder Model
    Ameur, Mohamed Seghir Hadj
    Meziane, Farid
    Guessoum, Ahmed
    ARABIC COMPUTATIONAL LINGUISTICS (ACLING 2017), 2017, 117: 287-297
  • [10] Text Recognition on Khmer Historical Documents using Glyph Class Map Generation with Encoder-Decoder Model
    Valy, Dona
    Verleysen, Michel
    Chhun, Sophea
    ICPRAM: PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS, 2019: 749-756