Machine Translation Using Improved Attention-based Transformer with Hybrid Input

Cited: 0
Authors
Abrishami, Mahsa [1 ]
Rashti, Mohammad Javad [1 ]
Naderan, Marjan [1 ]
Affiliations
[1] Shahid Chamran Univ Ahvaz, Dept Comp Engn, Ahvaz, Iran
Keywords
Neural Machine Translation; Self-Attention; Multi-Head Attention; Encoder-Decoder
DOI
10.1109/icwr49608.2020.9122317
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Machine Translation (MT) refers to the automated, software-based translation of natural-language text. The inherent complexities and incompatibilities of natural languages make MT a daunting task that faces numerous challenges, especially when its output is compared to manual translation. With the emergence of deep-learning approaches, Neural Machine Translation (NMT) has pushed MT results closer to human expectations. Among the newest deep-learning techniques are sequence-to-sequence approaches based on Recurrent Neural Networks (RNNs), convolutions, and transformers, all employing encoder/decoder pairs. In this study, an attention-based deep-learning architecture is proposed for MT, with all layers focused exclusively on multi-head attention and built on a transformer comprising multi-layer encoders and decoders. The main contribution of the proposed model lies in feeding each layer a weighted combination of the layer's primary input and the outputs of the previous layers. This hybrid-input mechanism yields more accurate translations than non-hybrid inputs. The model is evaluated on two datasets for German/English translation: WMT'14 for training and newstest2012 for testing. The experiments are run on GPU-equipped Google Colab instances, and the results show a score of 36.7 BLEU, a 5% improvement over previous work without the hybrid-input technique.
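To make the hybrid-input idea concrete, the following is a minimal PyTorch sketch of one possible reading of the abstract: each encoder layer consumes a weighted mix of the block's primary input (the embedded source tokens) and the previous layer's output, with a learnable scalar weight per layer. The class names, the sigmoid-bounded mixing weight, and the hyperparameters are illustrative assumptions, not the authors' exact formulation.

import torch
import torch.nn as nn

class HybridInputEncoderLayer(nn.Module):
    # One encoder layer whose input is a weighted mix of the original
    # embeddings and the previous layer's output (illustrative sketch,
    # not the paper's exact scheme).
    def __init__(self, d_model=512, nhead=8, dim_feedforward=2048):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(
            d_model, nhead, dim_feedforward, batch_first=True)
        # Learnable mixing weight; the sigmoid keeps it in (0, 1).
        self.alpha = nn.Parameter(torch.tensor(0.0))

    def forward(self, prev_output, primary_input):
        w = torch.sigmoid(self.alpha)
        mixed = w * primary_input + (1.0 - w) * prev_output
        return self.layer(mixed)

class HybridInputEncoder(nn.Module):
    def __init__(self, num_layers=6, d_model=512):
        super().__init__()
        self.layers = nn.ModuleList(
            HybridInputEncoderLayer(d_model) for _ in range(num_layers))

    def forward(self, x):
        primary = x          # embedded source tokens (plus positions)
        out = x
        for layer in self.layers:
            out = layer(out, primary)
        return out

# Usage: a batch of 2 sentences, 10 tokens each, d_model = 512.
enc = HybridInputEncoder()
src = torch.randn(2, 10, 512)
print(enc(src).shape)    # torch.Size([2, 10, 512])

Under this reading, the learned weight lets each layer interpolate between a standard stacked transformer (weight near 0) and one that re-reads the raw embeddings (weight near 1).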
Pages: 52-57
Page count: 6
Related Papers
50 records in total
  • [1] Recursive Annotations for Attention-Based Neural Machine Translation
    Ye, Shaolin
    Guo, Wu
    [J]. 2017 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2017, : 164 - 167
  • [2] CRAN: An Hybrid CNN-RNN Attention-Based Model for Arabic Machine Translation
    Bensalah, Nouhaila
    Ayad, Habib
    Adib, Abdellah
    El Farouk, Abdelhamid Ibn
    [J]. NETWORKING, INTELLIGENT SYSTEMS AND SECURITY, 2022, 237 : 87 - 102
  • [3] Face-based age estimation using improved Swin Transformer with attention-based convolution
    Shi, Chaojun
    Zhao, Shiwei
    Zhang, Ke
    Wang, Yibo
    Liang, Longping
    [J]. FRONTIERS IN NEUROSCIENCE, 2023, 17
  • [4] Neural Machine Translation Models with Attention-Based Dropout Layer
    Israr, Huma
    Khan, Safdar Abbas
    Tahir, Muhammad Ali
    Shahzad, Muhammad Khuram
    Ahmad, Muneer
    Zain, Jasni Mohamad
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 75 (02): : 2981 - 3009
  • [5] An Effective Coverage Approach for Attention-based Neural Machine Translation
    Hoang-Quan Nguyen
    Thuan-Minh Nguyen
    Huy-Hien Vu
    Van-Vinh Nguyen
    Phuong-Thai Nguyen
    Thi-Nga-My Dao
    Kieu-Hue Tran
    Khac-Quy Dinh
    [J]. PROCEEDINGS OF 2019 6TH NATIONAL FOUNDATION FOR SCIENCE AND TECHNOLOGY DEVELOPMENT (NAFOSTED) CONFERENCE ON INFORMATION AND COMPUTER SCIENCE (NICS), 2019, : 240 - 245
  • [6] An Improved Transformer-Based Neural Machine Translation Strategy: Interacting-Head Attention
    Li, Dongxing
    Luo, Zuying
    [J]. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2022, 2022
  • [7] Hybrid Attention-based Transformer for Long-range Document Classification
    Qin, Ruyu
    Huang, Min
    Liu, Jiawei
    Miao, Qinghai
    [J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [8] Incorporating Word Reordering Knowledge into Attention-based Neural Machine Translation
    Zhang, Jinchao
    Wang, Mingxuan
    Liu, Qun
    Zhou, Jie
    [J]. PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1, 2017, : 1524 - 1534
  • [9] Machine Fault Detection Using a Hybrid CNN-LSTM Attention-Based Model
    Borre, Andressa
    Seman, Laio Oriel
    Camponogara, Eduardo
    Stefenon, Stefano Frizzo
    Mariani, Viviana Cocco
    Coelho, Leandro dos Santos
    [J]. SENSORS, 2023, 23 (09)
  • [10] Variational Attention-Based Interpretable Transformer Network for Rotary Machine Fault Diagnosis
    Li, Yasong
    Zhou, Zheng
    Sun, Chuang
    Chen, Xuefeng
    Yan, Ruqiang
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 35 (05) : 6878 - 6892