Evaluation of Context-dependent Phrasal Translation Lexicons for Statistical Machine Translation

Cited by: 0
Authors
Carpuat, Marine [1]
Wu, Dekai [1]
Affiliations
[1] Univ Sci & Technol, Dept Comp Sci & Engn, Human Language Technol Ctr, HKUST, Hong Kong, Hong Kong, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
H0 [Linguistics];
Discipline classification codes
030303; 0501; 050102
Abstract
We present new direct data analysis showing that dynamically-built context-dependent phrasal translation lexicons are more useful resources for phrase-based statistical machine translation (SMT) than conventional static phrasal translation lexicons, which ignore all contextual information. After several years of surprising negative results, recent work suggests that context-dependent phrasal translation lexicons are an appropriate framework to successfully incorporate Word Sense Disambiguation (WSD) modeling into SMT. However, this approach has so far only been evaluated using automatic translation quality metrics, which are important, but aggregate many different factors. A direct analysis is still needed to understand how context-dependent phrasal translation lexicons impact translation quality, and whether the additional complexity they introduce is really necessary. In this paper, we focus on the impact of context-dependent translation lexicons on lexical choice in phrase-based SMT and show that context-dependent lexicons are more useful to a phrase-based SMT system than a conventional lexicon. A typical phrase-based SMT system makes use of more and longer phrases with context modeling, including phrases that were not seen very frequently in training. Even when the segmentation is identical, context-dependent lexicons yield translations that match references more often than conventional lexicons.
Pages: 3520-3527
Page count: 8