Improving Chinese to English SMT with multiple CWS results

Authors
Ma, Yongliang [1]
Zhao, Tiejun [1]
Affiliations
[1] Harbin Inst Technol, MOE Microsoft Key Lab Nat Language Proc & Speech, Harbin 150006, Peoples R China
Keywords
Chinese word segmentation; SMT; feature blending; feature interpolation
DOI
10.1109/IALP.2009.36
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In Chinese-to-English statistical machine translation (SMT), Chinese text must first be pre-processed by segmenting sentences into words; the standard approach is Chinese word segmentation (CWS). However, CWS was not developed for SMT, so its results are not necessarily optimal for SMT. In recent years, many studies have investigated making CWS more suitable for SMT, but we explore the problem from another direction. Our basic idea is to use multiple CWS results as additional sources of linguistic knowledge, and we present a simple and effective approach for using multiple CWS results in SMT. We also report experimental results over a range of strategy settings and obtain substantial improvements in Chinese-to-English translation performance. The best result shows a gain of 1.89 BLEU percentage points over a state-of-the-art HPBT baseline system that does not use multiple CWS results.
Pages: 135-140
Number of pages: 6