xCOMET: Transparent Machine Translation Evaluation through Fine-grained Error Detection

Authors
Guerreiro, Nuno M. [1, 3, 4, 5]
Rei, Ricardo [1, 2, 5]
van Stigt, Daan [1]
Coheur, Luisa [2, 5]
Colombo, Pierre [4]
Martins, Andre F. T. [1, 3, 5]
Affiliations
[1] Unbabel Lisbon, Lisbon, Portugal
[2] INESC ID, Lisbon, Portugal
[3] Inst Telecomunicacoes, Lisbon, Portugal
[4] Univ Paris Saclay, MICS, Cent Supelec, Paris, France
[5] Univ Lisbon, Inst Super Tecn, Lisbon, Portugal
Funding
European Research Council
Keywords
Compendex
DOI
10.1162/tacl_a_00683
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Widely used learned metrics for machine translation evaluation, such as COMET and BLEURT, estimate the quality of a translation hypothesis by providing a single sentence-level score. As such, they offer little insight into translation errors (e.g., what are the errors and what is their severity). On the other hand, generative large language models (LLMs) are amplifying the adoption of more granular strategies to evaluation, attempting to detail and categorize translation errors. In this work, we introduce xCOMET, an open-source learned metric designed to bridge the gap between these approaches. xCOMET integrates both sentence-level evaluation and error span detection capabilities, exhibiting state-of-the-art performance across all types of evaluation (sentence-level, system-level, and error span detection). Moreover, it does so while highlighting and categorizing error spans, thus enriching the quality assessment. We also provide a robustness analysis with stress tests, and show that xCOMET is largely capable of identifying localized critical errors and hallucinations.
Pages: 979-995
Page count: 17