Assessing Human-Parity in Machine Translation on the Segment Level

Cited by: 0
Authors
Graham, Yvette [1]
Federmann, Christian [2]
Eskevich, Maria [3]
Haddow, Barry [4]
Affiliations
[1] Trinity Coll Dublin, ADAPT, Dublin, Ireland
[2] Microsoft Res, Redmond, WA USA
[3] CLARIN ERIC, Utrecht, Netherlands
[4] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent machine translation shared tasks have shown top-performing systems to tie or in some cases even outperform human translation. Such conclusions about system and human performance are, however, based on estimates aggregated from scores collected over large test sets of translations and so leave some questions unanswered. For instance, the fact that a system significantly outperforms the human translator on average does not necessarily mean that it has done so for every translation in the test set. Furthermore, are there source segments in evaluation test sets that still pose significant challenges for top-performing systems, and can such challenging segments go unnoticed due to the opacity of current human evaluation procedures? To provide insight into these issues, we carefully inspect the outputs of top-performing systems in the recent WMT19 news translation shared task for all language pairs in which a system either tied or outperformed human translation. Our analysis provides a new method of identifying the remaining segments for which either machine or human performs poorly. For example, in our close inspection of WMT19 English to German and German to English, we discover the segments that disjointly proved a challenge for human and machine. For English to Russian, no segments in our sample of translations caused a significant challenge for the human translator, while we again identify the set of segments that caused issues for the top-performing system.
Pages: 4199-4207
Page count: 9