The Effect of Translationese in Machine Translation Test Sets

Cited by: 0
Authors
Zhang, Mike [1]
Toral, Antonio [2]
Affiliations
[1] Univ Groningen, Informat Sci Programme, Groningen, Netherlands
[2] Univ Groningen, Ctr Language & Cognit, Groningen, Netherlands
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The effect of translationese has been studied in the field of machine translation (MT), mostly with respect to training data. We study in depth the effect of translationese on test data, using the test sets from the last three editions of WMT's news shared task, which contain 17 translation directions. We show evidence that (i) the use of translationese in test sets results in inflated human evaluation scores for MT systems; (ii) in some cases system rankings do change; and (iii) the impact translationese has on a translation direction is inversely correlated with the translation quality attainable by state-of-the-art MT systems for that direction.
Pages: 73-81
Page count: 9
Related papers
  • [1] Statistical Power and Translationese in Machine Translation Evaluation
    Graham, Yvette
    Haddow, Barry
    Koehn, Philipp
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020: 72-81
  • [2] Improving Statistical Machine Translation by Adapting Translation Models to Translationese
    Lembersky, Gennadi
    Ordan, Noam
    Wintner, Shuly
    [J]. COMPUTATIONAL LINGUISTICS, 2013, 39 (04): 999-1024
  • [3] Machine Translationese: Effects of Algorithmic Bias on Linguistic Complexity in Machine Translation
    Vanmassenhove, Eva
    Shterionov, Dimitar
    Gwilliam, Matthew
    [J]. 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021: 2203-2213
  • [4] Translating Translationese: A Two-Step Approach to Unsupervised Machine Translation
    Pourdamghani, Nima
    Aldarrab, Nada
    Ghazvininejad, Marjan
    Knight, Kevin
    May, Jonathan
    [J]. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019: 3057-3062
  • [5] Original or Translated? A Causal Analysis of the Impact of Translationese on Machine Translation Performance
    Ni, Jingwei
    Jin, Zhijing
    Freitag, Markus
    Sachan, Mrinmaya
    Schölkopf, Bernhard
    [J]. NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022: 5303-5320
  • [6] On Translationese in Translation of Political Classics
    Fang, Hui
    Yu, Dong
    [J]. PROCEEDINGS OF THE SECOND INTERNATIONAL CONFERENCE ON GLOBALIZATION: CHALLENGES FOR TRANSLATORS AND INTERPRETERS, VOL. I, 2017: 307-312
  • [7] Function words in statistical machine-translated Chinese and original Chinese: A study into the translationese of machine translation systems
    Kuo, Chen-li
    [J]. DIGITAL SCHOLARSHIP IN THE HUMANITIES, 2019, 34 (04): 752-771
  • [8] Approaching translationese through parallel and translation corpora
    Schmied, J
    Schaffler, H
    [J]. SYNCHRONIC CORPUS LINGUISTICS, 1996, (16): 41-56
  • [9] Translationese and interlanguage in inverse translation: A case study
    Yue, Ming
    Sun, Boyang
    [J]. ACROSS LANGUAGES AND CULTURES, 2021, 22 (01): 45-63
  • [10] Identification of Translationese: A Machine Learning Approach
    Ilisei, Iustina
    Inkpen, Diana
    Pastor, Gloria Corpas
    Mitkov, Ruslan
    [J]. COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2010, 6008: 503-+