Analysing terminology translation errors in statistical and neural machine translation

Cited by: 7
Authors
Haque, Rejwanul [1]
Hasanuzzaman, Mohammed [1]
Way, Andy [1]
Affiliation
[1] Dublin City Univ, ADAPT Ctr, Sch Comp, Dublin, Ireland
Funding
Science Foundation Ireland
Keywords
Terminology translation; Machine translation; Phrase-based statistical machine translation; Neural machine translation; Quality
DOI
10.1007/s10590-020-09251-z
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Terminology translation plays a critical role in domain-specific machine translation (MT). Phrase-based statistical MT (PB-SMT) has been the dominant approach to MT for the past 30 years, both in academia and industry. Neural MT (NMT), an end-to-end learning approach to MT, is steadily taking the place of PB-SMT. In this paper, we conduct a comparative qualitative evaluation and a comprehensive error analysis of terminology translation in PB-SMT and NMT in two translation directions: English-to-Hindi and Hindi-to-English. To the best of our knowledge, there is no gold standard available for evaluating terminology translation quality in MT. For this reason, we select an evaluation test set from a legal domain corpus and create a gold standard for evaluating terminology translation in MT. We also propose an error typology that takes the terminology translation errors in MT into consideration. We translate the sentences of the test set with our MT systems, and the terminology translations are manually classified as per the error typology. We evaluate the MT systems' performance on terminology translation and demonstrate our findings, unraveling the strengths, weaknesses, and similarities of PB-SMT and NMT in the area of term translation.
Pages: 149-195
Page count: 47