SemSyn: Semantic-Syntactic Similarity Based Automatic Machine Translation Evaluation Metric

Cited by: 3
Authors
Chauhan, Shweta [1]
Kumar, Rahul [2]
Saxena, Shefali [2]
Kaur, Amandeep [3]
Daniel, Philemon [2]
Affiliations
[1] Chandigarh Univ, Univ Ctr Res & Dev Dept, Dept Elect & Commun Engn, Mohali 140413, Punjab, India
[2] Natl Inst Technol, Dept Elect & Commun Engn, Hamirpur 177005, Himachal Pradesh, India
[3] Indian Inst Informat Technol & Management Gwalior, Dept Management Studies, ABV, Gwalior 474015, Madhya Pradesh, India
Keywords
Evaluation metric; Machine translation evaluation; Semantic; Syntactic; Word embeddings
DOI
10.1080/03772063.2023.2195819
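Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract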
Machine translation evaluation is challenging for natural languages because different languages behave differently on the same dataset. Lexical-based metrics represent semantic relationships poorly and impose strict identity matching. Translation and assessment also become difficult for morphologically rich target languages with relatively free word order. Most standard evaluation metrics consider word order but do not effectively consider sentence structure. In this paper, we propose a novel machine translation evaluation metric, SemSyn, which incorporates both semantic and syntactic similarity. We combine term frequency-inverse document frequency with the earth mover's distance and word embeddings to capture semantic similarity. Part-of-speech and dependency parsing tags help capture syntactic similarity in the sentence structure. These tags are extracted from Universal Dependencies and trained using the SpaCy library. Experimental results show that SemSyn has a higher correlation with human judgment than other evaluation metrics for morphologically rich languages and other languages.
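The abstract names the ingredients of the metric but not its formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It makes several assumptions: the embeddings in `EMB` are invented 2-D toy vectors; the semantic part uses a relaxed, nearest-neighbour approximation of the earth mover's distance with IDF-based weights rather than an exact transport solver; the part-of-speech and dependency tags are supplied by hand (in practice they might come from a parser such as spaCy); and the function names and the equal weighting `alpha=0.5` are hypothetical.

```python
import math
from difflib import SequenceMatcher

# Toy 2-D word embeddings, invented purely for illustration.
EMB = {
    "the": (0.5, 0.5), "cat": (1.0, 0.0), "feline": (0.9, 0.1),
    "sat": (0.0, 1.0), "rested": (0.1, 0.9),
}

def idf_weights(tokens, corpus):
    """Smoothed IDF weight per token position, normalised to sum to 1.
    A repeated token receives mass once per occurrence, which plays the TF role."""
    n = len(corpus)
    raw = [math.log((1 + n) / (1 + sum(t in doc for doc in corpus))) + 1.0
           for t in tokens]
    total = sum(raw)
    return [w / total for w in raw]

def relaxed_distance(src, tgt, corpus):
    """One-sided relaxation of the earth mover's distance: every source token
    ships all of its mass to its nearest target token in embedding space."""
    weights = idf_weights(src, corpus)
    return sum(w * min(math.dist(EMB[s], EMB[t]) for t in tgt)
               for w, s in zip(weights, src))

def semantic_similarity(hyp, ref, corpus):
    """Map the larger of the two one-sided distances into (0, 1]."""
    d = max(relaxed_distance(hyp, ref, corpus),
            relaxed_distance(ref, hyp, corpus))
    return 1.0 / (1.0 + d)

def syntactic_similarity(hyp_tags, ref_tags):
    """Similarity of two (POS, dependency) tag sequences via difflib matching."""
    return SequenceMatcher(None, hyp_tags, ref_tags).ratio()

def combined_score(hyp, ref, hyp_tags, ref_tags, corpus, alpha=0.5):
    """Hypothetical linear mix of the semantic and syntactic scores."""
    return (alpha * semantic_similarity(hyp, ref, corpus)
            + (1.0 - alpha) * syntactic_similarity(hyp_tags, ref_tags))

# Example: a reference, a paraphrased hypothesis, and hand-written tags.
reference = ["the", "cat", "sat"]
hypothesis = ["the", "feline", "rested"]
tags = [("DET", "det"), ("NOUN", "nsubj"), ("VERB", "ROOT")]
corpus = [reference, hypothesis]

print(combined_score(hypothesis, reference, tags, tags, corpus))
```

A lexical metric would give the paraphrase no credit for "feline" or "rested", whereas the embedding-based distance treats them as near matches, and the tag comparison rewards the identical sentence structure.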
Pages: 3823-3834
Number of pages: 12
Related papers
50 in total
  • [1] Methodology for fuzzy duplicate record identification based on the semantic-syntactic information of similarity
    Hadzic, Djulaga
    Sarajlic, Nermin
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2020, 32 (01) : 126 - 136
  • [2] The METEOR metric for automatic evaluation of machine translation
    Lavie, Alon
    Denkowski, Michael J.
    MACHINE TRANSLATION, 2009, 23 (2-3) : 105 - 115
  • [3] STD: An Automatic Evaluation Metric for Machine Translation Based on Word Embeddings
    Li, Pairui
    Chen, Chuan
    Zheng, Wujie
    Deng, Yuetang
    Ye, Fanghua
    Zheng, Zibin
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (10) : 1497 - 1506
  • [4] MEE: An Automatic Metric for Evaluation Using Embeddings for Machine Translation
    Mukherjee, Ananya
    Ala, Hema
    Shrivastava, Manish
    Sharma, Dipti Misra
    2020 IEEE 7TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA 2020), 2020, : 292 - 299
  • [5] MateTee: A Semantic Similarity Metric Based on Translation Embeddings for Knowledge Graphs
    Morales, Camilo
    Collarana, Diego
    Vidal, Maria-Esther
    Auer, Soeren
    WEB ENGINEERING (ICWE 2017), 2017, 10360 : 246 - 263
  • [6] Analysis of the Impact of Machine Translation Evaluation Metrics for Semantic Textual Similarity
    Magnolini, Simone
    Ngoc Phuoc An Vo
    Popescu, Octavian
    AI*IA 2016: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2016, 10037 : 450 - 463
  • [8] BLONDE: An Automatic Evaluation Metric for Document-level Machine Translation
    Jiang, Yuchen Eleanor
    Liu, Tianyu
    Ma, Shuming
    Zhang, Dongdong
    Yang, Jian
    Huang, Haoyang
    Sennrich, Rico
    Sachan, Mrinmaya
    Cotterell, Ryan
    Zhou, Ming
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 1550 - 1565
  • [9] MaxSD: A Neural Machine Translation Evaluation Metric Optimized by Maximizing Similarity Distance
    Ma, Qingsong
    Meng, Fandong
    Zheng, Daqi
    Wang, Mingxuan
    Graham, Yvette
    Jiang, Wenbin
    Liu, Qun
    NATURAL LANGUAGE UNDERSTANDING AND INTELLIGENT APPLICATIONS (NLPCC 2016), 2016, 10102 : 153 - 161
  • [10] Semantic Evaluation of Machine Translation
    Wong, Billy Tak-Ming
    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010, : 2884 - 2888