Phrase table filtration based on virtual context in phrase-based statistical machine translation

Authors
Yin, Yue [1 ]
Zhang, Yu Jie [1 ]
Xu, Jin An [1 ]
Affiliations
[1] Beijing Jiaotong Univ, Beijing, Peoples R China
Keywords
phrase-based statistical machine translation; filter phrase table; virtual context;
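DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract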
In statistical machine translation systems, an automatically extracted phrase table inevitably contains a large number of erroneous and redundant phrase pairs, which waste time and space during decoding and affect translation quality. To solve this problem, we propose a method for filtering the phrase table in which a virtual context is introduced to calculate an incremental quantity in the language model score of each phrase pair. By considering the maximum and minimum incremental quantities in the score obtained from the virtual context, we design a filtering strategy that re-ranks phrase pairs. We conducted experiments on NTCIR-9 data to verify the method. The experimental results show that when the phrase table was reduced to 47% of its original size, translation quality improved slightly; when it was reduced to 30% of its original size, translation quality declined only slightly. These results indicate that the method can effectively filter out redundant phrase pairs from the phrase table.
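The abstract gives no formulas, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a toy bigram language model, treats the "virtual context" as a small set of candidate words preceding the target phrase, takes the incremental quantity to be the log-probability of the phrase's first word given that context, and re-ranks by the base score plus the midpoint of the minimum and maximum increments. The names `BIGRAM_LOGPROB`, `UNSEEN_LOGPROB`, `increment_range`, and `filter_phrase_table`, as well as all scores, are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy bigram log-probabilities; a real system would query a trained LM.
BIGRAM_LOGPROB: Dict[Tuple[str, str], float] = {
    ("<s>", "the"): -0.7,
    ("see", "the"): -0.9,
    ("the", "cat"): -1.1,
    ("<s>", "cat"): -3.0,
    ("see", "cat"): -4.0,
    ("<s>", "see"): -1.5,
}
UNSEEN_LOGPROB = -8.0  # assumed flat penalty for bigrams missing from the toy model

PhrasePair = Tuple[str, str, float]  # (source phrase, target phrase, base score)


def bigram_logprob(prev: str, word: str) -> float:
    """Look up log P(word | prev), falling back to the assumed penalty."""
    return BIGRAM_LOGPROB.get((prev, word), UNSEEN_LOGPROB)


def increment_range(target: str, virtual_contexts: List[str]) -> Tuple[float, float]:
    """Return (min, max) LM increment of the target phrase over all virtual contexts."""
    first_word = target.split()[0]
    increments = [bigram_logprob(ctx, first_word) for ctx in virtual_contexts]
    return min(increments), max(increments)


def filter_phrase_table(
    table: List[PhrasePair], virtual_contexts: List[str], keep_ratio: float
) -> List[PhrasePair]:
    """Re-rank phrase pairs and keep the top fraction (assumed ranking rule)."""

    def rerank_key(entry: PhrasePair) -> float:
        _, target, base_score = entry
        lowest, highest = increment_range(target, virtual_contexts)
        # Assumption: combine the extremes by their midpoint.
        return base_score + 0.5 * (lowest + highest)

    ranked = sorted(table, key=rerank_key, reverse=True)
    keep = max(1, int(len(ranked) * keep_ratio))
    return ranked[:keep]


if __name__ == "__main__":
    toy_table: List[PhrasePair] = [
        ("src_a", "the cat", -1.0),
        ("src_a", "cat", -1.2),
        ("src_a", "dog cat", -1.1),
        ("src_b", "see", -0.5),
    ]
    for pair in filter_phrase_table(toy_table, ["<s>", "see"], keep_ratio=0.5):
        print(pair)
```

The midpoint rule is just one way to use the two extremes; a pessimistic filter could rank by the minimum increment alone, and an optimistic one by the maximum.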
Pages: 327-330
Number of pages: 4