Improving Chinese to English SMT with multiple CWS results

Cited by: 0
Authors
Ma, Yongliang [1]
Zhao, Tiejun [1]
Affiliations
[1] Harbin Institute of Technology, MOE-Microsoft Key Laboratory of Natural Language Processing and Speech, Harbin 150006, People's Republic of China
Keywords
Chinese word segmentation; SMT; feature blending; feature interpolation
DOI
10.1109/IALP.2009.36
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In Chinese-to-English statistical machine translation (SMT), Chinese text must first be pre-processed to segment sentences into words; the standard approach is Chinese word segmentation (CWS). However, CWS was not developed for SMT, so its results are not necessarily optimal for SMT. In recent years, many studies have investigated how to make CWS more suitable for SMT, but we explore the problem from another direction. Our basic idea is to use multiple CWS results as additional sources of linguistic knowledge, and we present a simple and effective approach for using multiple CWS results in SMT. We also report experimental results over a range of strategy settings and obtain substantial improvements in Chinese-to-English translation performance. The best result is a gain of 1.89 BLEU percentage points over a state-of-the-art HPBT baseline system that does not use multiple CWS results.
Pages: 135-140
Number of pages: 6