Improving Chinese to English SMT with multiple CWS results

Authors
Ma, Yongliang [1]
Zhao, Tiejun [1]
Affiliations
[1] Harbin Inst Technol, MOE Microsoft Key Lab Nat Language Proc & Speech, Harbin 150006, Peoples R China
Keywords
Chinese word segmentation; SMT; feature blending; feature interpolation
DOI
10.1109/IALP.2009.36
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In Chinese-to-English statistical machine translation (SMT), Chinese text must first be pre-processed by segmenting sentences into words; the standard approach is Chinese word segmentation (CWS). However, CWS was not developed for SMT, so its results are not necessarily optimal for SMT. In recent years, many studies have investigated making CWS more suitable for SMT, but we approach the problem from another direction. The basic idea of this paper is to use multiple CWS results as additional sources of linguistic knowledge, and we present a simple and effective approach for using multiple CWS results in SMT. We also report experimental results over a range of strategy settings and obtain substantial improvements in Chinese-to-English translation performance. The best result is a gain of 1.89 BLEU percentage points over a state-of-the-art HPBT baseline system that does not use multiple CWS results.
Pages: 135-140
Number of pages: 6