Improving Adversarial Neural Machine Translation for Morphologically Rich Language

Cited by: 9
Authors
Mi, Chenggang [1]
Xie, Lei [1]
Zhang, Yanning [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Natl Engn Lab Integrated Aerosp Ground Ocean Big, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Neural machine translation (NMT); morphologically rich language; adversarial training; morphological word embedding; multiple references;
DOI
10.1109/TETCI.2019.2960546
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Generative adversarial networks (GANs) have achieved great success in natural language processing (NLP) and neural machine translation (NMT). However, the existing discriminator in GANs for NMT only combines two words as one query to train the translation models, which limits how meaningful the discriminator can be and fails to exploit rich monolingual information. Moreover, recent studies consider only a single reference translation during model training, which limits the GAN model's ability to learn sufficient information about the representation of the source sentence. These problems are even worse when the languages are morphologically rich. In this article, an extended GAN model for neural machine translation is proposed to improve the translation of morphologically rich languages. In particular, we use morphological word embeddings instead of word embeddings as input to the GAN model to enrich the representation of words and overcome the data sparsity problem during model training. In addition, multiple references are integrated into the discriminator so that the model considers more context information and adapts to the diversity of different languages. Experimental results on German <-> English, French <-> English, Czech <-> English, Finnish <-> English, Turkish <-> English, Chinese <-> English, Finnish <-> Turkish and Turkish <-> Czech translation tasks demonstrate that our method achieves significant improvements over baseline systems.
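The abstract does not say how the morphological word embedding is built. As a rough illustration only, the sketch below assumes one common composition scheme: a word vector is the average of the vectors of its morphemes, so that rare inflected forms share parameters with frequent ones. The segmentation table `SEGMENTS`, the helpers `make_table` and `morph_embedding`, and the averaging itself are all hypothetical and are not taken from the paper.

```python
import random

# Hypothetical morpheme segmentations for two Turkish word forms
# (illustrative only; the paper's segmenter is not described in the abstract).
SEGMENTS = {
    "evler": ["ev", "ler"],          # "houses"
    "evlerde": ["ev", "ler", "de"],  # "in the houses"
}


def make_table(units, dim, seed=0):
    """Create a small randomly initialised embedding table (a stand-in for learned vectors)."""
    rng = random.Random(seed)
    return {u: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for u in units}


def morph_embedding(morphemes, table):
    """Assumed composition: average the morpheme vectors to get the word vector."""
    dim = len(next(iter(table.values())))
    vec = [0.0] * dim
    for m in morphemes:
        for i, x in enumerate(table[m]):
            vec[i] += x
    return [x / len(morphemes) for x in vec]


if __name__ == "__main__":
    table = make_table(["ev", "ler", "de"], dim=4)
    for word, morphemes in SEGMENTS.items():
        print(word, morph_embedding(morphemes, table))
```

Under this assumption, both word forms reuse the vectors for `ev` and `ler`, which is one way an input representation could reduce the data sparsity the abstract mentions.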
Pages: 417-426
Page count: 10