Analysing terminology translation errors in statistical and neural machine translation

Cited by: 7
Authors
Haque, Rejwanul [1]
Hasanuzzaman, Mohammed [1]
Way, Andy [1]
Affiliations
[1] Dublin City Univ, ADAPT Ctr, Sch Comp, Dublin, Ireland
Funding
Science Foundation Ireland;
Keywords
Terminology translation; Machine translation; Phrase-based statistical machine translation; Neural machine translation; QUALITY;
DOI
10.1007/s10590-020-09251-z
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Terminology translation plays a critical role in domain-specific machine translation (MT). Phrase-based statistical MT (PB-SMT) has been the dominant approach to MT for the past 30 years, both in academia and industry. Neural MT (NMT), an end-to-end learning approach to MT, is steadily taking the place of PB-SMT. In this paper, we conduct comparative qualitative evaluation and comprehensive error analysis on terminology translation in PB-SMT and NMT in two translation directions: English-to-Hindi and Hindi-to-English. To the best of our knowledge, there is no gold standard available for evaluating terminology translation quality in MT. For this reason we select an evaluation test set from a legal domain corpus and create a gold standard for evaluating terminology translation in MT. We also propose an error typology taking the terminology translation errors in MT into consideration. We translate sentences of the test set with our MT systems and terminology translations are manually classified as per the error typology. We evaluate the MT system's performance on terminology translation, and demonstrate our findings, unraveling strengths, weaknesses, and similarities of PB-SMT and NMT in the area of term translation.
Pages: 149-195
Number of pages: 47