Machine Translation Evaluation: Manual Versus Automatic - A Comparative Study

Cited by: 2

Authors:

Maurya, Kaushal Kumar [1]

Ravindran, Renjith P. [1]

Anirudh, Ch Ram [1]

Murthy, Kavi Narayana [1]

Affiliation:

[1] Univ Hyderabad, Sch Comp & Informat Sci, Hyderabad, India

Source:

DATA ENGINEERING AND COMMUNICATION TECHNOLOGY, ICDECT-2K19 | 2020 / Vol. 1079

Keywords:

Machine translation (MT); MT evaluation; Manual metrics; Automatic metrics

DOI:

10.1007/978-981-15-1097-7_45

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The quality of machine translation (MT) is best judged by humans well versed in both the source and target languages. However, automatic techniques are often used because they are much faster, cheaper and language-independent. The goal of this paper is to check for correlation between manual and automatic evaluation, specifically in the context of Indian languages. To the extent that automatic evaluation methods correlate with manual evaluations, we can get the best of both worlds. In this paper, we perform a comparative study of automatic evaluation metrics (BLEU, NIST, METEOR, TER and WER) against the manual evaluation metric (adequacy) for English-Hindi translation. We also attempt to estimate the manual evaluation score of a given MT output from its automatic evaluation score. The data for the study was sourced from the Workshop on Statistical Machine Translation, WMT14.
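The abstract does not say which statistics the authors used, so the following is only a minimal sketch of the general idea: measure how strongly an automatic metric tracks human adequacy judgments, then fit a simple model that estimates adequacy from the automatic score. The choice of Pearson correlation and a least-squares line, the function names `pearson` and `fit_line`, and all score values are illustrative assumptions and are not taken from the paper or from WMT14.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("cannot fit a line to constant xs")
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Invented example values (NOT data from the paper): one automatic score
# and one human adequacy rating per translated segment.
automatic_scores = [0.12, 0.25, 0.31, 0.47, 0.58]
adequacy_ratings = [1.5, 2.0, 2.5, 3.5, 4.0]

r = pearson(automatic_scores, adequacy_ratings)
slope, intercept = fit_line(automatic_scores, adequacy_ratings)

# Estimate adequacy for a new segment from its automatic score alone.
estimated_adequacy = slope * 0.40 + intercept
print(f"r = {r:.3f}, estimated adequacy = {estimated_adequacy:.2f}")
```

For error-rate metrics such as TER and WER, where lower scores mean better output, a useful metric would be expected to show a negative correlation with adequacy rather than a positive one.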


Pages: 541-553

Number of pages: 13
