Neural Machine Translation with Recurrent Highway Networks

Cited by: 6
Authors
Parmar, Maulik [1 ]
Devi, V. Susheela [1 ]
Affiliations
[1] Indian Institute of Science, Bengaluru 560012, Karnataka, India
Keywords
Recurrent highway networks; Reconstructor; Attention; Encoder-decoder
DOI
10.1007/978-3-030-05918-7_27
CLC Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Recurrent neural networks have lately gained considerable popularity in language modelling tasks, especially in neural machine translation (NMT). Most recent NMT models follow the encoder-decoder paradigm, where a deep LSTM-based encoder projects the source sentence onto a fixed-dimensional vector and another deep LSTM decodes the target sentence from that vector. However, there has been very little work exploring architectures with more than one layer in space, i.e. within each time step. This paper examines the effectiveness of simple Recurrent Highway Networks (RHNs) on NMT tasks. Our model uses RHNs in both the encoder and the decoder, together with attention, and we additionally explore a reconstructor model to improve translation adequacy. We demonstrate the effectiveness of all three approaches on the IWSLT English-Vietnamese dataset. RHNs perform on par with LSTM-based models, and better in some cases, and deep RHN models are easier to train than deep LSTM-based models because of their highway connections. The paper also investigates the effect of increasing the recurrent depth within each time step.
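For readers unfamiliar with the recurrence the abstract refers to, the sketch below illustrates one plausible formulation of an RHN cell: each time step applies several stacked highway micro-steps to the hidden state, with a coupled carry gate c = 1 - t as in Zilly et al.'s original RHN formulation. This is a minimal illustration of the general technique, not the authors' implementation; the class name RHNCell and all parameter names are ours.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RHNCell:
    """Minimal Recurrent Highway Network cell with coupled gates (c = 1 - t).

    One call to step() performs `depth` highway micro-steps on the state,
    which is the 'recurrence depth in each time step' the paper studies.
    """

    def __init__(self, input_size, hidden_size, depth, rng=None):
        rng = rng or np.random.default_rng(0)
        self.depth = depth
        # Input projections enter only the first micro-step (l = 0).
        self.W_h = rng.normal(0.0, 0.1, (hidden_size, input_size))
        self.W_t = rng.normal(0.0, 0.1, (hidden_size, input_size))
        # Separate recurrent weights and biases per micro-step.
        self.R_h = [rng.normal(0.0, 0.1, (hidden_size, hidden_size)) for _ in range(depth)]
        self.R_t = [rng.normal(0.0, 0.1, (hidden_size, hidden_size)) for _ in range(depth)]
        self.b_h = [np.zeros(hidden_size) for _ in range(depth)]
        self.b_t = [np.zeros(hidden_size) for _ in range(depth)]

    def step(self, x, s):
        """Advance the state s by one time step given input x."""
        for l in range(self.depth):
            h_in = self.R_h[l] @ s + self.b_h[l]
            t_in = self.R_t[l] @ s + self.b_t[l]
            if l == 0:  # the input is consumed only by the first micro-step
                h_in += self.W_h @ x
                t_in += self.W_t @ x
            h = np.tanh(h_in)   # candidate transform
            t = sigmoid(t_in)   # transform gate
            s = h * t + s * (1.0 - t)  # highway mix: transform vs. carry
        return s

# Usage: run a 5-step toy sequence through a depth-4 cell.
cell = RHNCell(input_size=8, hidden_size=16, depth=4)
s = np.zeros(16)
for x in np.random.default_rng(1).normal(size=(5, 8)):
    s = cell.step(x, s)
```

The carry path s * (1 - t) is what makes deep recurrent depths trainable: when t is near zero, the state passes through a micro-step almost unchanged, which is the "easy to train" property the abstract attributes to highway connections.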
Pages: 299-308 (10 pages)
Related Papers (50 items)
  • [1] Recurrent stacking of layers in neural networks: An application to neural machine translation
    Dabre, Prasanna Raj Noel
    Fujita, Atsushi
    arXiv, 2021
  • [2] BILINGUAL RECURRENT NEURAL NETWORKS FOR IMPROVED STATISTICAL MACHINE TRANSLATION
    Zhao, Bing
    Tam, Yik-Cheung
    2014 IEEE Workshop on Spoken Language Technology (SLT 2014), 2014: 66-70
  • [3] Machine Translation for Indian Languages Utilizing Recurrent Neural Networks and Attention
    Sharma, Sonali
    Diwakar, Manoj
    Distributed Computing and Optimization Techniques (ICDCOT 2021), 2022, 903: 593-602
  • [4] Recurrent Attention for Neural Machine Translation
    Zeng, Jiali
    Wu, Shuangzhi
    Yin, Yongjing
    Jiang, Yufan
    Li, Mu
    2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), 2021: 3216-3225
  • [5] Variational Recurrent Neural Machine Translation
    Su, Jinsong
    Wu, Shan
    Xiong, Deyi
    Lu, Yaojie
    Han, Xianpei
    Zhang, Biao
    Thirty-Second AAAI Conference on Artificial Intelligence / Thirtieth Innovative Applications of Artificial Intelligence Conference / Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, 2018: 5488-5495
  • [6] Recurrent Positional Embedding for Neural Machine Translation
    Chen, Kehai
    Wang, Rui
    Utiyama, Masao
    Sumita, Eiichiro
    2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019), 2019: 1361-1367
  • [7] Machine translation evaluation with neural networks
    Guzman, Francisco
    Joty, Shafiq
    Marquez, Lluis
    Nakov, Preslav
    Computer Speech and Language, 2017, 45: 180-200
  • [8] The Use of Recurrent Neural Networks Language Model in Turkish-English Machine Translation
    Yilmaz, Ertugrul
    El-Kahlout, Ilknur Durgar
    2014 22nd Signal Processing and Communications Applications Conference (SIU), 2014: 1247-1250
  • [9] A Bidirectional Recurrent Neural Language Model for Machine Translation
    Peris, Alvaro
    Casacuberta, Francisco
    Procesamiento del Lenguaje Natural, 2015, (55): 109-116
  • [10] A Recursive Recurrent Neural Network for Statistical Machine Translation
    Liu, Shujie
    Yang, Nan
    Li, Mu
    Zhou, Ming
    Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Vol. 1, 2014: 1491-1500