A Neural Attention-Based Encoder-Decoder Approach for English to Bangla Translation

Times cited: 0
Authors
Al Shiam, Abdullah [1]
Redwan, Sadi Md. [2]
Kabir, Humaun [3]
Shin, Jungpil [4]
Affiliations
[1] Sheikh Hasina Univ, Dept Comp Sci & Engn, Netrokona 2400, Bangladesh
[2] Univ Rajshahi, Dept Comp Sci & Engn, Rajshahi 6205, Bangladesh
[3] Bangamata Sheikh Fojilatunnesa Mujib Sci & Technol, Dept Comp Sci & Engn, Jamalpur 2012, Bangladesh
[4] Univ Aizu Aizuwakamatsu, Sch Comp Sci & Engn, Fukushima 9658580, Japan
Keywords
Neural Machine Translation (NMT); Machine Translation (MT); Encoder-Decoder Model; Neural Attention;
DOI
10.56415/csjm.v31.04
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Machine translation (MT) is the process of translating text from one language to another using bilingual datasets and grammatical rules. Recent work in MT has popularized sequence-to-sequence models that leverage neural attention and deep learning. The success of neural attention models has yet to be translated into a robust framework for automated English-to-Bangla translation, owing to the lack of a comprehensive dataset that covers the diverse vocabulary of the Bangla language. In this study, we propose an English-to-Bangla MT system that uses an encoder-decoder attention model and the CCMatrix corpus. Our results show that this model can outperform traditional statistical machine translation (SMT) and rule-based machine translation (RBMT) models, achieving a Bilingual Evaluation Understudy (BLEU) score of 15.68 despite being constrained by the limited vocabulary of the corpus. We hypothesize that this model can be used successfully for state-of-the-art machine translation given a more diverse and accurate dataset. This work can be extended to incorporate several newer datasets using transfer learning techniques.
Pages: 70-85
Page count: 16