A Relationship: Word Alignment, Phrase Table, and Translation Quality

Cited by: 3
Authors
Tian, Liang [1]
Wong, Derek F. [1]
Chao, Lidia S. [1]
Oliveira, Francisco [1]
Affiliations
[1] Univ Macau, Dept Comp & Informat Sci, Nat Language Proc & Portuguese Chinese Machine Tr, Taipa, Peoples R China
DOI
10.1155/2014/438106
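Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09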
Abstract
In recent years, researchers have conducted several studies that evaluate machine translation quality based on the relationship between word alignments and the phrase table. However, existing methods usually employ ad hoc heuristics without theoretical support. So far, no work has provided a formula describing the relationship among word alignments, the phrase table, and machine translation performance. In this paper, we first formulate such a relationship to estimate the number of extracted phrase pairs given one or more word alignment points. Second, we propose a corpus-motivated pruning technique to prune the default large phrase table. Experiments show that the derived formula is feasible: it can be used to predict the size of the phrase table, and it is also a valuable reference for investigating the relationship between translation performance and phrase tables built from different word alignment links. The corpus-motivated pruning results show that nearly 98% of phrases can be pruned without any significant loss in translation quality.
Pages: 13