Phrase Alignment Confidence for Statistical Machine Translation

Citations: 0
Authors
Ananthakrishnan, Sankaranarayanan [1]
Prasad, Rohit [1]
Natarajan, Prem [1]
Affiliations
[1] BBN Technologies, Speech & Language Processing Unit, Cambridge, MA, USA
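Keywords
DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Subject classification codes
0808; 0809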
Abstract
The performance of phrase-based statistical machine translation (SMT) systems is crucially dependent on the quality of the extracted phrase pairs, which is in turn a function of word alignment quality. Data sparsity, an inherent problem in SMT even with large training corpora, often has an adverse impact on the reliability of the extracted phrase translation pairs. In this paper, we present a novel feature based on bootstrap resampling of the training corpus, termed phrase alignment confidence, that measures the goodness of a phrase translation pair. We integrate this feature within a phrase-based SMT system and show an improvement of 1.7% BLEU and 4.4% METEOR over a baseline English-to-Pashto (E2P) SMT system that does not use any measure of phrase pair quality. We then show that the proposed measure compares well to an existing indicator of phrase pair reliability, the lexical smoothing probability. We also demonstrate that combining the two measures leads to a further improvement of 0.4% BLEU and 0.3% METEOR on the E2P system. Commensurate translation improvements are obtained on automatic speech recognition (ASR) transcripts of the source speech utterances.
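The abstract does not spell out how the bootstrap-based feature is computed. As a rough illustration only, one plausible reading is: resample the training corpus with replacement many times, re-extract phrase pairs from each resample, and score a phrase pair by the fraction of resamples in which it is extracted. The sketch below follows that assumed reading. The names `bootstrap_confidence` and `toy_extractor`, the placeholder tokens, and the toy corpus are all hypothetical and are not taken from the paper; a real system would rerun word alignment and phrase extraction on every resample.

```python
import random
from collections import Counter


def bootstrap_confidence(corpus, extract_phrase_pairs, num_resamples=200, seed=0):
    """Score each phrase pair by the fraction of bootstrap resamples
    in which it is extracted (an assumed definition, not the paper's)."""
    rng = random.Random(seed)
    counts = Counter()
    n = len(corpus)
    for _ in range(num_resamples):
        # Draw n sentence pairs with replacement from the training corpus.
        sample = [corpus[rng.randrange(n)] for _ in range(n)]
        # Count each phrase pair at most once per resample.
        counts.update(set(extract_phrase_pairs(sample)))
    return {pair: c / num_resamples for pair, c in counts.items()}


def toy_extractor(sample):
    """Hypothetical stand-in for word alignment plus phrase extraction:
    every sentence pair already lists the phrase pairs it would yield."""
    pairs = set()
    for _source, _target, phrase_pairs in sample:
        pairs.update(phrase_pairs)
    return pairs


# Toy corpus with placeholder tokens (not real English or Pashto data).
corpus = [
    ("s1", "t1", [("a", "x"), ("b", "y")]),
    ("s2", "t2", [("a", "x"), ("b", "y")]),
    ("s3", "t3", [("a", "x"), ("b", "y")]),
    ("s4", "t4", [("a", "x"), ("c", "z")]),
]

confidence = bootstrap_confidence(corpus, toy_extractor)
for pair, score in sorted(confidence.items()):
    print(pair, round(score, 3))
```

Under this reading, a phrase pair supported by many sentence pairs survives almost every resample and scores near 1, while a pair that depends on a single sentence pair is absent from a sizeable share of resamples and scores lower. Such a score could then be added as one more feature in the log-linear model alongside other indicators such as the lexical smoothing probability.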
Pages: 2878-2881
Page count: 4
Related papers
50 results in total
  • [1] HMM word and phrase alignment for statistical machine translation
    Deng, Yonggang
    Byrne, William
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2008, 16 (03) : 494 - 507
  • [2] Statistical machine translation using hierarchical phrase alignment
    Watanabe, Taro
    Imamura, Kenji
    Sumita, Eiichiro
    Okuno, Hiroshi G.
    [J]. Systems and Computers in Japan, 2007, 38 (06) : 70 - 79
  • [3] Integrated phrase segmentation and alignment algorithm for Statistical Machine Translation
    Zhang, Y
    Vogel, S
    Waibel, A
    [J]. 2003 INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, PROCEEDINGS, 2003, : 567 - 573
  • [4] Phrase-based alignment models for statistical machine translation
    Tomás, J
    Lloret, J
    Casacuberta, F
    [J]. PATTERN RECOGNITION AND IMAGE ANALYSIS, PT 2, PROCEEDINGS, 2005, 3523 : 605 - 613
  • [5] Bayesian Word Alignment and Phrase Table Training for Statistical Machine Translation
    Li, Zezhong
    Ikeda, Hideto
    Fukumoto, Junichi
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2013, E96D (07) : 1536 - 1543
  • [6] Neural Machine Translation With Explicit Phrase Alignment
    Zhang, Jiacheng
    Luan, Huanbo
    Sun, Maosong
    Zhai, Feifei
    Xu, Jingfang
    Liu, Yang
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 1001 - 1010
  • [7] Comparing and integrating alignment template and standard phrase-based statistical machine translation
    Xu, Lin
    Cao, Xiaoguang
    Zhang, Bufeng
    Li, Mu
    [J]. Computational Linguistics and Intelligent Text Processing, 2007, 4394 : 420 - 431
  • [8] Phrase-based statistical machine translation
    Zens, R
    Och, FJ
    Ney, H
    [J]. KI2002: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2002, 2479 : 18 - 32
  • [9] Statistical machine translation decoder based on phrase
    ATR Spoken Language Translation Research Laboratories, 2-2-2 Hikaridai Seika-cho, Soraku-gun, Kyoto
    619-0288, Japan
    Unknown
    606-8501, Japan
    [J]. Int. Conf. Spok. Lang. Process., ICSLP, (1889-1892):
  • [10] Improvements in phrase-based statistical machine translation
    Zens, R
    Ney, H
    [J]. HLT-NAACL 2004: HUMAN LANGUAGE TECHNOLOGY CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE MAIN CONFERENCE, 2004, : 257 - 264