Machine Translation Using Improved Attention-based Transformer with Hybrid Input

Cited: 0
Authors
Abrishami, Mahsa [1 ]
Rashti, Mohammad Javad [1 ]
Naderan, Marjan [1 ]
Affiliations
[1] Shahid Chamran Univ Ahvaz, Dept Comp Engn, Ahvaz, Iran
Keywords
Neural Machine Translation; Self-Attention; Multi-Head Attention; Encoder-Decoder
DOI
10.1109/icwr49608.2020.9122317
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Machine Translation (MT) refers to the automated, software-based translation of natural-language text. The inherent complexities and incompatibilities of natural languages make MT a daunting task that faces numerous challenges, especially when its output is compared to manual translation. With the emergence of deep-learning approaches, Neural Machine Translation (NMT) has pushed MT results closer to human expectations. Among the newest deep-learning techniques are sequence-to-sequence approaches based on Recurrent Neural Networks (RNNs), convolutions, and transformers, all employing encoder/decoder pairs. In this study, an attention-based deep-learning architecture is proposed for MT, with all layers focused exclusively on multi-head attention and built on a transformer comprising multi-layer encoders and decoders. The main contribution of the proposed model lies in feeding each layer a weighted combination of the layer's primary input and the outputs of the previous layers. This hybrid-input mechanism yields more accurate translations than non-hybrid inputs. The model is evaluated on two datasets for German/English translation: WMT'14 for training and newstest2012 for testing. The experiments are run on GPU-equipped Google Colab instances, and the results show a score of 36.7 BLEU, a 5% improvement over previous work without the hybrid-input technique.
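To make the hybrid-input idea concrete, the following is a minimal PyTorch sketch of one possible reading of the abstract: each encoder layer consumes a weighted mix of the block's primary input (the embedded source tokens) and the previous layer's output, with a learnable scalar weight per layer. The class names, the sigmoid-bounded mixing weight, and the hyperparameters are illustrative assumptions, not the authors' exact formulation.

import torch
import torch.nn as nn

class HybridInputEncoderLayer(nn.Module):
    # One encoder layer whose input is a weighted mix of the original
    # embeddings and the previous layer's output (illustrative sketch,
    # not the paper's exact scheme).
    def __init__(self, d_model=512, nhead=8, dim_feedforward=2048):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(
            d_model, nhead, dim_feedforward, batch_first=True)
        # Learnable mixing weight; the sigmoid keeps it in (0, 1).
        self.alpha = nn.Parameter(torch.tensor(0.0))

    def forward(self, prev_output, primary_input):
        w = torch.sigmoid(self.alpha)
        mixed = w * primary_input + (1.0 - w) * prev_output
        return self.layer(mixed)

class HybridInputEncoder(nn.Module):
    def __init__(self, num_layers=6, d_model=512):
        super().__init__()
        self.layers = nn.ModuleList(
            HybridInputEncoderLayer(d_model) for _ in range(num_layers))

    def forward(self, x):
        primary = x          # embedded source tokens (plus positions)
        out = x
        for layer in self.layers:
            out = layer(out, primary)
        return out

# Usage: a batch of 2 sentences, 10 tokens each, d_model = 512.
enc = HybridInputEncoder()
src = torch.randn(2, 10, 512)
print(enc(src).shape)    # torch.Size([2, 10, 512])

Under this reading, the learned weight lets each layer interpolate between a standard stacked transformer (weight near 0) and one that re-reads the raw embeddings (weight near 1).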
Pages: 52-57
Page count: 6
Related Papers
50 records in total
  • [1] Recursive Annotations for Attention-Based Neural Machine Translation
    Ye, Shaolin
    Guo, Wu
    [J]. 2017 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2017, : 164 - 167
  • [2] CRAN: An Hybrid CNN-RNN Attention-Based Model for Arabic Machine Translation
    Bensalah, Nouhaila
    Ayad, Habib
    Adib, Abdellah
    El Farouk, Abdelhamid Ibn
    [J]. NETWORKING, INTELLIGENT SYSTEMS AND SECURITY, 2022, 237 : 87 - 102
  • [3] Face-based age estimation using improved Swin Transformer with attention-based convolution
    Shi, Chaojun
    Zhao, Shiwei
    Zhang, Ke
    Wang, Yibo
    Liang, Longping
    [J]. FRONTIERS IN NEUROSCIENCE, 2023, 17
  • [4] Neural Machine Translation Models with Attention-Based Dropout Layer
    Israr, Huma
    Khan, Safdar Abbas
    Tahir, Muhammad Ali
    Shahzad, Muhammad Khuram
    Ahmad, Muneer
    Zain, Jasni Mohamad
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 75 (02): : 2981 - 3009
  • [5] An Effective Coverage Approach for Attention-based Neural Machine Translation
    Hoang-Quan Nguyen
    Thuan-Minh Nguyen
    Huy-Hien Vu
    Van-Vinh Nguyen
    Phuong-Thai Nguyen
    Thi-Nga-My Dao
    Kieu-Hue Tran
    Khac-Quy Dinh
    [J]. PROCEEDINGS OF 2019 6TH NATIONAL FOUNDATION FOR SCIENCE AND TECHNOLOGY DEVELOPMENT (NAFOSTED) CONFERENCE ON INFORMATION AND COMPUTER SCIENCE (NICS), 2019, : 240 - 245
  • [6] An Improved Transformer-Based Neural Machine Translation Strategy: Interacting-Head Attention
    Li, Dongxing
    Luo, Zuying
    [J]. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2022, 2022
  • [7] Hybrid Attention-based Transformer for Long-range Document Classification
    Qin, Ruyu
    Huang, Min
    Liu, Jiawei
    Miao, Qinghai
    [J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [8] Incorporating Word Reordering Knowledge into Attention-based Neural Machine Translation
    Zhang, Jinchao
    Wang, Mingxuan
    Liu, Qun
    Zhou, Jie
    [J]. PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1, 2017, : 1524 - 1534
  • [9] Machine Fault Detection Using a Hybrid CNN-LSTM Attention-Based Model
    Borre, Andressa
    Seman, Laio Oriel
    Camponogara, Eduardo
    Stefenon, Stefano Frizzo
    Mariani, Viviana Cocco
    Coelho, Leandro dos Santos
    [J]. SENSORS, 2023, 23 (09)
  • [10] Variational Attention-Based Interpretable Transformer Network for Rotary Machine Fault Diagnosis
    Li, Yasong
    Zhou, Zheng
    Sun, Chuang
    Chen, Xuefeng
    Yan, Ruqiang
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 35 (05) : 6878 - 6892