Phrase Alignment Confidence for Statistical Machine Translation

Cited by: 0
Authors
Ananthakrishnan, Sankaranarayanan [1]
Prasad, Rohit [1]
Natarajan, Prem [1]
Affiliations
[1] BBN Technologies, Speech & Language Processing Unit, Cambridge, MA, USA
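Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;
Abstract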
The performance of phrase-based statistical machine translation (SMT) systems is crucially dependent on the quality of the extracted phrase pairs, which is in turn a function of word alignment quality. Data sparsity, an inherent problem in SMT even with large training corpora, often has an adverse impact on the reliability of the extracted phrase translation pairs. In this paper, we present a novel feature based on bootstrap resampling of the training corpus, termed phrase alignment confidence, that measures the goodness of a phrase translation pair. We integrate this feature within a phrase-based SMT system and show an improvement of 1.7% BLEU and 4.4% METEOR over a baseline English-to-Pashto (E2P) SMT system that does not use any measure of phrase pair quality. We then show that the proposed measure compares well to an existing indicator of phrase pair reliability, the lexical smoothing probability. We also demonstrate that combining the two measures leads to a further improvement of 0.4% BLEU and 0.3% METEOR on the E2P system. Commensurate translation improvements are obtained on automatic speech recognition (ASR) transcripts of the source speech utterances.
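The abstract does not say how the bootstrap-based feature is computed. As a rough illustration only, one plausible reading is sketched below: resample the training sentence pairs with replacement, re-run phrase extraction on each resample, and score each phrase pair by the fraction of resamples in which it is extracted. The function names, the toy corpus layout, and the `extract_phrase_pairs` stand-in (which replaces real word alignment and phrase extraction) are all assumptions made for this sketch, not the authors' implementation.

```python
import random
from collections import Counter


def extract_phrase_pairs(corpus):
    """Hypothetical stand-in for word alignment plus phrase extraction.

    In this toy setup each corpus item is (source, target, phrase_pairs),
    so "extraction" just collects the phrase pairs attached to the sentences.
    A real system would re-align and re-extract on every resample.
    """
    pairs = set()
    for _source, _target, sentence_pairs in corpus:
        pairs.update(sentence_pairs)
    return pairs


def phrase_alignment_confidence(corpus, num_resamples=100, seed=0):
    """Fraction of bootstrap resamples in which each phrase pair is extracted.

    This scoring rule is an assumption for illustration; the abstract only
    states that the feature is based on bootstrap resampling.
    """
    rng = random.Random(seed)
    counts = Counter()
    n = len(corpus)
    for _ in range(num_resamples):
        # Draw n sentence pairs with replacement from the training corpus.
        resample = [corpus[rng.randrange(n)] for _ in range(n)]
        counts.update(extract_phrase_pairs(resample))
    return {pair: count / num_resamples for pair, count in counts.items()}


# Toy corpus: one phrase pair is supported by every sentence, one by a single sentence.
toy_corpus = [
    ("s1", "t1", {("hello", "salaam")}),
    ("s2", "t2", {("hello", "salaam")}),
    ("s3", "t3", {("hello", "salaam"), ("rare", "nadir")}),
]
confidence = phrase_alignment_confidence(toy_corpus)
```

In this sketch a phrase pair supported by every sentence is extracted from every resample, while a pair seen in only one sentence drops out whenever that sentence is not drawn, so it receives a lower score. Such a score could then be added as one more feature in the log-linear model of a phrase-based system.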
Pages: 2878-2881
Page count: 4