Phrase table filtration based on virtual context in phrase-based statistical machine translation

Cited by: 0
Authors
Yin, Yue [1]
Zhang, Yu Jie [1]
Xu, Jin An [1]
Affiliation
[1] Beijing Jiaotong Univ, Beijing, Peoples R China
Keywords
phrase-based statistical machine translation; filter phrase table; virtual context;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In a statistical machine translation system, the automatically extracted phrase table inevitably contains a large number of erroneous and redundant phrase pairs, which wastes time and space during decoding and degrades translation quality. To address this problem, we propose a phrase table filtering method in which a virtual context is introduced to calculate an incremental quantity in the language model score of a phrase pair. By considering the maximum and minimum increments in score over the virtual context, we design a filtering strategy that re-ranks phrase pairs. We conducted experiments on NTCIR-9 data to verify the method. The results show that when the phrase table was reduced to 47% of its original size, translation quality improved slightly; when it was reduced to 30% of its original size, translation quality declined only slightly. These results indicate that the method can effectively filter redundant phrase pairs out of the phrase table.
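The abstract does not give the scoring formulas, so the following is only a minimal sketch of one possible reading of the idea, not the authors' implementation. It assumes a toy bigram language model (`BIGRAM_LOGPROB`), a hypothetical set of single-word virtual contexts, and an assumed ranking score that adds the translation model log-probability to the mean of the maximum and minimum language model increments. All names and numbers are illustrative.

```python
import math
from collections import defaultdict

# Hypothetical toy bigram language model: log P(word | previous word).
BIGRAM_LOGPROB = {
    ("<s>", "the"): math.log(0.5),
    ("the", "cat"): math.log(0.4),
    ("a", "cat"): math.log(0.3),
}
UNSEEN_LOGPROB = math.log(1e-4)  # assumed flat back-off for unseen bigrams


def lm_increment(context_word, target_words):
    """Log-probability the LM adds when target_words follow context_word."""
    total, prev = 0.0, context_word
    for word in target_words:
        total += BIGRAM_LOGPROB.get((prev, word), UNSEEN_LOGPROB)
        prev = word
    return total


def increment_bounds(target_words, virtual_contexts):
    """Maximum and minimum LM increment over all virtual contexts."""
    increments = [lm_increment(c, target_words) for c in virtual_contexts]
    return max(increments), min(increments)


def filter_phrase_table(table, virtual_contexts, keep_ratio):
    """Re-rank the candidates of each source phrase and keep the top share.

    table: list of (source, target, tm_logprob) tuples.
    The ranking score below is an assumption made for this sketch.
    """
    by_source = defaultdict(list)
    for source, target, tm_logprob in table:
        best, worst = increment_bounds(target.split(), virtual_contexts)
        score = tm_logprob + 0.5 * (best + worst)
        by_source[source].append((score, target, tm_logprob))

    kept = []
    for source, candidates in by_source.items():
        candidates.sort(reverse=True)  # highest score first
        n_keep = max(1, math.ceil(len(candidates) * keep_ratio))
        kept.extend((source, t, tm) for _, t, tm in candidates[:n_keep])
    return kept


table = [("neko", "the cat", -1.0), ("neko", "cat the", -1.0)]
filtered = filter_phrase_table(table, ["<s>", "a"], keep_ratio=0.5)
```

With equal translation model scores, the word order that the language model prefers in every virtual context ranks first, so the redundant pair is the one dropped. A real system would replace the toy bigram table with a trained n-gram model and choose `keep_ratio` to reach the target table size.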
Pages: 327-330
Page count: 4