Multi-coverage Model for Neural Machine Translation

Cited by: 0
Authors
Liu J.-P. [1 ]
Huang K.-Y. [1 ]
Li J.-Y. [1 ]
Song D.-X. [1 ]
Huang D.-G. [1 ]
Affiliations
[1] School of Computer Science and Technology, Dalian University of Technology, Dalian
Source
Ruan Jian Xue Bao/Journal of Software | 2022 / Vol. 33 / No. 3
Keywords
Attention mechanism; Multi-coverage model; Neural machine translation; Over-translation; Sequence-to-sequence model; Under-translation;
DOI
10.13328/j.cnki.jos.006201
Abstract
The over-translation and under-translation problems in neural machine translation can be alleviated by coverage models. Existing methods usually store coverage information in a single form, such as a coverage vector or a coverage score, and do not consider the relationship between these different forms, so the available information is used insufficiently. This study proposes a multi-coverage mechanism based on the consistency of the translation history and the complementarity between the different models. The concept of a word-level coverage score is defined first; then the coverage information stored in both the coverage vector and the coverage score is incorporated into the attention mechanism simultaneously, reducing the impact of information loss. Two models are introduced, differing in how the two kinds of coverage information are fused. Experiments are carried out on a Chinese-to-English translation task with a sequence-to-sequence model. Results show that the proposed method significantly improves translation performance and the alignment quality between source and target words. Compared with a model using only a coverage vector, the number of over-translation and under-translation errors is further reduced. © Copyright 2022, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
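To make the mechanism described in the abstract concrete, below is a minimal PyTorch sketch of attention conditioned on both a coverage vector and a scalar coverage score. It assumes additive (Bahdanau-style) attention; the class name MultiCoverageAttention, the GRU-based coverage-vector update, and the specific fusion (summing all terms inside the attention energy) are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class MultiCoverageAttention(nn.Module):
    """Additive attention augmented with two forms of coverage information:
    a learned per-source-word coverage vector and a scalar coverage score
    (accumulated attention mass). Hypothetical sketch, not the paper's model."""

    def __init__(self, hidden_dim: int, cov_dim: int):
        super().__init__()
        self.w_dec = nn.Linear(hidden_dim, hidden_dim, bias=False)   # decoder state
        self.w_enc = nn.Linear(hidden_dim, hidden_dim, bias=False)   # encoder states
        self.w_cov = nn.Linear(cov_dim, hidden_dim, bias=False)      # coverage vector
        self.w_score = nn.Linear(1, hidden_dim, bias=False)          # coverage score
        self.v = nn.Linear(hidden_dim, 1, bias=False)
        # GRU cell that updates each source word's coverage vector from
        # the attention weight it just received
        self.cov_update = nn.GRUCell(1, cov_dim)

    def forward(self, dec_state, enc_states, cov_vec, cov_score):
        # dec_state: (B, H); enc_states: (B, S, H)
        # cov_vec: (B, S, C); cov_score: (B, S), running attention mass
        energy = torch.tanh(
            self.w_dec(dec_state).unsqueeze(1)       # (B, 1, H)
            + self.w_enc(enc_states)                 # (B, S, H)
            + self.w_cov(cov_vec)                    # (B, S, H)
            + self.w_score(cov_score.unsqueeze(-1))  # (B, S, H)
        )
        attn = torch.softmax(self.v(energy).squeeze(-1), dim=-1)       # (B, S)
        context = torch.bmm(attn.unsqueeze(1), enc_states).squeeze(1)  # (B, H)

        # Update both coverage representations for the next decoding step.
        B, S, C = cov_vec.shape
        new_cov_vec = self.cov_update(
            attn.reshape(B * S, 1), cov_vec.reshape(B * S, C)
        ).view(B, S, C)
        new_cov_score = cov_score + attn  # word-level sum of attention weights
        return context, attn, new_cov_vec, new_cov_score
```

In this sketch the coverage vector is updated recurrently, in the spirit of Tu et al.'s coverage model, while the coverage score is the running sum of attention weights per source word; feeding both into the attention energy is one plausible way to fuse the two forms so that neither is used in isolation.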
Pages: 1141-1152
Page count: 11