Neural Machine Translation with Recurrent Highway Networks

Cited by: 6
Authors
Parmar, Maulik [1 ]
Devi, V. Susheela [1 ]
Affiliations
[1] Indian Institute of Science, Bengaluru 560012, Karnataka, India
Keywords
Recurrent highway networks; Reconstructor; Attention; Encoder-decoder
DOI
10.1007/978-3-030-05918-7_27
CLC Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Recurrent neural networks have lately gained considerable popularity in language modelling tasks, especially in neural machine translation (NMT). Most recent NMT models follow the encoder-decoder paradigm, where a deep LSTM-based encoder projects the source sentence onto a fixed-dimensional vector and another deep LSTM decodes the target sentence from that vector. However, there has been very little work exploring architectures with more than one layer in space, i.e. within each time step. This paper examines the effectiveness of simple Recurrent Highway Networks (RHNs) on NMT tasks. Our model uses RHNs in both the encoder and the decoder, together with attention, and we additionally explore a reconstructor model to improve translation adequacy. We demonstrate the effectiveness of all three approaches on the IWSLT English-Vietnamese dataset. RHNs perform on par with LSTM-based models, and better in some cases, and deep RHN models are easier to train than deep LSTM-based models because of their highway connections. The paper also investigates the effect of increasing the recurrent depth within each time step.
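For readers unfamiliar with the recurrence the abstract refers to, the sketch below illustrates one plausible formulation of an RHN cell: each time step applies several stacked highway micro-steps to the hidden state, with a coupled carry gate c = 1 - t as in Zilly et al.'s original RHN formulation. This is a minimal illustration of the general technique, not the authors' implementation; the class name RHNCell and all parameter names are ours.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RHNCell:
    """Minimal Recurrent Highway Network cell with coupled gates (c = 1 - t).

    One call to step() performs `depth` highway micro-steps on the state,
    which is the 'recurrence depth in each time step' the paper studies.
    """

    def __init__(self, input_size, hidden_size, depth, rng=None):
        rng = rng or np.random.default_rng(0)
        self.depth = depth
        # Input projections enter only the first micro-step (l = 0).
        self.W_h = rng.normal(0.0, 0.1, (hidden_size, input_size))
        self.W_t = rng.normal(0.0, 0.1, (hidden_size, input_size))
        # Separate recurrent weights and biases per micro-step.
        self.R_h = [rng.normal(0.0, 0.1, (hidden_size, hidden_size)) for _ in range(depth)]
        self.R_t = [rng.normal(0.0, 0.1, (hidden_size, hidden_size)) for _ in range(depth)]
        self.b_h = [np.zeros(hidden_size) for _ in range(depth)]
        self.b_t = [np.zeros(hidden_size) for _ in range(depth)]

    def step(self, x, s):
        """Advance the state s by one time step given input x."""
        for l in range(self.depth):
            h_in = self.R_h[l] @ s + self.b_h[l]
            t_in = self.R_t[l] @ s + self.b_t[l]
            if l == 0:  # the input is consumed only by the first micro-step
                h_in += self.W_h @ x
                t_in += self.W_t @ x
            h = np.tanh(h_in)   # candidate transform
            t = sigmoid(t_in)   # transform gate
            s = h * t + s * (1.0 - t)  # highway mix: transform vs. carry
        return s

# Usage: run a 5-step toy sequence through a depth-4 cell.
cell = RHNCell(input_size=8, hidden_size=16, depth=4)
s = np.zeros(16)
for x in np.random.default_rng(1).normal(size=(5, 8)):
    s = cell.step(x, s)
```

The carry path s * (1 - t) is what makes deep recurrent depths trainable: when t is near zero, the state passes through a micro-step almost unchanged, which is the "easy to train" property the abstract attributes to highway connections.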
Pages: 299-308 (10 pages)
Related Papers (50 items)
  • [1] Recurrent stacking of layers in neural networks: An application to neural machine translation
    Dabre, Prasanna Raj Noel
    Fujita, Atsushi
    arXiv, 2021
  • [2] BILINGUAL RECURRENT NEURAL NETWORKS FOR IMPROVED STATISTICAL MACHINE TRANSLATION
    Zhao, Bing
    Tam, Yik-Cheung
    2014 IEEE Workshop on Spoken Language Technology (SLT 2014), 2014: 66-70
  • [3] Machine Translation for Indian Languages Utilizing Recurrent Neural Networks and Attention
    Sharma, Sonali
    Diwakar, Manoj
    Distributed Computing and Optimization Techniques (ICDCOT 2021), 2022, 903: 593-602
  • [4] Recurrent Attention for Neural Machine Translation
    Zeng, Jiali
    Wu, Shuangzhi
    Yin, Yongjing
    Jiang, Yufan
    Li, Mu
    2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), 2021: 3216-3225
  • [5] Variational Recurrent Neural Machine Translation
    Su, Jinsong
    Wu, Shan
    Xiong, Deyi
    Lu, Yaojie
    Han, Xianpei
    Zhang, Biao
    Thirty-Second AAAI Conference on Artificial Intelligence / Thirtieth Innovative Applications of Artificial Intelligence Conference / Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, 2018: 5488-5495
  • [6] Recurrent Positional Embedding for Neural Machine Translation
    Chen, Kehai
    Wang, Rui
    Utiyama, Masao
    Sumita, Eiichiro
    2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019), 2019: 1361-1367
  • [7] Machine translation evaluation with neural networks
    Guzman, Francisco
    Joty, Shafiq
    Marquez, Lluis
    Nakov, Preslav
    Computer Speech and Language, 2017, 45: 180-200
  • [8] The Use of Recurrent Neural Networks Language Model in Turkish-English Machine Translation
    Yilmaz, Ertugrul
    El-Kahlout, Ilknur Durgar
    2014 22nd Signal Processing and Communications Applications Conference (SIU), 2014: 1247-1250
  • [9] A Bidirectional Recurrent Neural Language Model for Machine Translation
    Peris, Alvaro
    Casacuberta, Francisco
    Procesamiento del Lenguaje Natural, 2015, (55): 109-116
  • [10] A Recursive Recurrent Neural Network for Statistical Machine Translation
    Liu, Shujie
    Yang, Nan
    Li, Mu
    Zhou, Ming
    Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Vol. 1, 2014: 1491-1500