Neural Machine Translation Models with Attention-Based Dropout Layer

Cited by: 2
Authors
Israr, Huma [1 ]
Khan, Safdar Abbas [1 ]
Tahir, Muhammad Ali [1 ]
Shahzad, Muhammad Khuram [1 ]
Ahmad, Muneer [1 ]
Zain, Jasni Mohamad [2 ]
Affiliations
[1] Natl Univ Sci & Technol, Sch Elect Engn & Comp Sci SEECS, Islamabad, Pakistan
[2] Univ Teknol MARA, Inst Big Data Analyt & Artificial Intelligence IBD, Kompleks Al Khawarizmi, Shah Alam 40450, Selangor, Malaysia
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2023, Vol. 75, No. 2
Keywords
Natural language processing; neural machine translation; word embedding; attention; perplexity; selective dropout; regularization; Urdu; Persian; Arabic; BLEU; URDU;
DOI
10.32604/cmc.2023.035814
CLC Classification Number
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
In bilingual translation, attention-based Neural Machine Translation (NMT) models are used to achieve synchrony between input and output sequences and to capture the notion of alignment. NMT models have obtained state-of-the-art performance for several language pairs. However, there has been little work exploring useful architectures for Urdu-to-English machine translation. We conducted extensive Urdu-to-English translation experiments using Long Short-Term Memory (LSTM), Bidirectional Recurrent Neural Network (Bi-RNN), Statistical Recurrent Unit (SRU), Gated Recurrent Unit (GRU), Convolutional Neural Network (CNN), and Transformer architectures. Experimental results show that Bi-RNN and LSTM with an attention mechanism, trained iteratively on a scalable dataset, make precise predictions on unseen data. The trained models yielded competitive results, achieving 62.6% and 61% accuracy and BLEU scores of 49.67 and 47.14, respectively. From a qualitative perspective, the translations of the test sets were examined manually, and it was observed that the trained models tend to produce repetitive output. The attention scores produced by Bi-RNN and LSTM yielded clear alignments, while GRU showed incorrect word translations, poor alignment, and a lack of clear structure. Therefore, we refined the attention-based models by defining an additional attention-based dropout layer. Attention dropout fixes alignment errors and minimizes translation errors at the word level. After empirical demonstration and comparison with their counterparts, we found an improvement in the quality of the resulting translation system and a decrease in perplexity and over-translation score. The ability of the proposed model was also evaluated on Arabic-English and Persian-English datasets. We empirically concluded that adding an attention-based dropout layer helps improve GRU, SRU, and Transformer translation, yielding considerable gains in both translation quality and speed.
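The core idea described in the abstract is to regularize the attention distribution itself: dropout is applied to the alignment weights before the context vector is computed. The paper's implementation is not reproduced here, so the following is only a minimal sketch in PyTorch, assuming a Luong-style global attention; the module name, the attn_dropout rate, and the tensor shapes are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch of attention with a dropout layer over the alignment
# weights. Names and hyper-parameters are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionWithDropout(nn.Module):
    def __init__(self, hidden_size: int, attn_dropout: float = 0.1):
        super().__init__()
        self.score = nn.Linear(hidden_size, hidden_size, bias=False)  # "general" score
        self.attn_dropout = nn.Dropout(attn_dropout)  # dropout over alignment weights

    def forward(self, decoder_state, encoder_outputs):
        # decoder_state: (batch, hidden); encoder_outputs: (batch, src_len, hidden)
        scores = torch.bmm(self.score(encoder_outputs),
                           decoder_state.unsqueeze(2)).squeeze(2)   # (batch, src_len)
        weights = F.softmax(scores, dim=-1)                         # alignment distribution
        weights = self.attn_dropout(weights)                        # attention-based dropout
        context = torch.bmm(weights.unsqueeze(1), encoder_outputs)  # (batch, 1, hidden)
        return context.squeeze(1), weights


# Usage on random tensors: one decoder step for a batch of 4, source length 10.
attn = AttentionWithDropout(hidden_size=256, attn_dropout=0.1)
dec = torch.randn(4, 256)
enc = torch.randn(4, 10, 256)
context, weights = attn(dec, enc)
print(context.shape, weights.shape)  # torch.Size([4, 256]) torch.Size([4, 10])
```

Zeroing a fraction of the alignment weights during training discourages the decoder from repeatedly attending to the same source token, which is consistent with the reduction in repetitive output and over-translation reported in the abstract.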
Pages: 2981-3009
Number of pages: 29