Machine Translation Evaluation: Manual Versus Automatic-A Comparative Study

Cited by: 2
Authors
Maurya, Kaushal Kumar [1]
Ravindran, Renjith P. [1]
Anirudh, Ch Ram [1]
Murthy, Kavi Narayana [1]
Affiliation
[1] Univ Hyderabad, Sch Comp & Informat Sci, Hyderabad, India
Source
DATA ENGINEERING AND COMMUNICATION TECHNOLOGY, ICDECT-2K19 | 2020 / Vol. 1079
Keywords
Machine translation (MT); MT evaluation; Manual metrics; Automatic metrics
DOI
10.1007/978-981-15-1097-7_45
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The quality of machine translation (MT) is best judged by humans well versed in both the source and target languages. However, automatic techniques are often used because they are much faster, cheaper and language independent. The goal of this paper is to check for correlation between manual and automatic evaluation, specifically in the context of Indian languages. To the extent that automatic evaluation methods correlate with manual evaluations, we can get the best of both worlds. In this paper, we perform a comparative study of the automatic evaluation metrics BLEU, NIST, METEOR, TER and WER against the manual evaluation metric (adequacy) for English-Hindi translation. We also attempt to estimate the manual evaluation score of a given MT output from its automatic evaluation score. The data for the study was sourced from the Workshop on Statistical Machine Translation (WMT14).
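As a rough illustration of the kind of analysis the abstract describes, the sketch below computes a correlation between an automatic metric and manual adequacy scores, then fits a line to estimate adequacy from the automatic score. The abstract does not say which correlation coefficient or estimation model the authors used; Pearson correlation and an ordinary least-squares line are assumptions made here, and the scores are made-up toy values, not data from WMT14.

```python
from math import sqrt


def _centered_sums(xs, ys):
    """Return means and the centered sums Sxy, Sxx, Syy for two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy


def pearson(xs, ys):
    """Pearson correlation coefficient between two score lists."""
    _, _, sxy, sxx, syy = _centered_sums(xs, ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x, mean_y, sxy, sxx, _ = _centered_sums(xs, ys)
    if sxx == 0:
        raise ValueError("cannot fit a line when all x values are equal")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy values (NOT from the paper): one automatic score and one
# manual adequacy rating per translated segment.
auto_scores = [0.10, 0.20, 0.30, 0.40, 0.50]
adequacy = [1.5, 2.0, 3.0, 3.5, 4.5]

r = pearson(auto_scores, adequacy)
slope, intercept = fit_line(auto_scores, adequacy)

# Estimate the manual score of a new MT output from its automatic score.
estimated_adequacy = slope * 0.35 + intercept

print(f"Pearson r = {r:.4f}")
print(f"adequacy ~ {slope:.2f} * score + {intercept:.2f}")
print(f"estimated adequacy at score 0.35: {estimated_adequacy:.3f}")
```

For error-based metrics such as TER and WER, where lower is better, a useful metric would be expected to show a negative correlation with adequacy, so the sign of `r` needs to be read accordingly.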
Pages: 541-553
Number of pages: 13