Training Set Similarity Based Parameter Selection for Statistical Machine Translation

Authors
Shi, Xuewen [1]
Huang, Heyan [1]
Jian, Ping [1]
Tang, Yi-Kun [1]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing Engn Res Ctr High Volume Language Informa, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Statistical machine translation; Log-linear model; Parameter selection;
DOI
10.1007/978-3-319-96890-2_6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Statistical machine translation (SMT) systems based on log-linear models are usually composed of multiple feature functions, each of which is assigned a weight as a model parameter. In this paper, we argue that different input source sentences may call for different model parameters. To adapt the model to different inputs, we propose a model parameter selection method for log-linear SMT systems. The method relies mainly on the characteristics of the feature functions themselves and makes no assumptions about unseen test sets. Experimental results on two language pairs (Zh-En and Ug-Zh) show improvements of up to 2.4 and 2.2 BLEU points, respectively, and also show that the proposed method is readily interpretable.
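The log-linear scoring the abstract refers to can be illustrated with a minimal sketch. It is not the authors' selection method, which the abstract does not specify; it only shows how a candidate translation is scored as a weighted sum of feature values, and why two weight vectors can prefer different candidates for the same input. The feature names (`lm`, `tm`), the feature values, and both weight vectors are hypothetical.

```python
from typing import Dict

Features = Dict[str, float]


def loglinear_score(features: Features, weights: Dict[str, float]) -> float:
    """Unnormalized log-linear score: the sum over features of weight * value."""
    return sum(weights[name] * value for name, value in features.items())


def best_candidate(candidates: Dict[str, Features], weights: Dict[str, float]) -> str:
    """Return the name of the candidate with the highest log-linear score."""
    return max(candidates, key=lambda name: loglinear_score(candidates[name], weights))


# Hypothetical log-domain feature values for two candidate translations
# of one source sentence (language model and translation model scores).
candidates = {
    "hyp_a": {"lm": -4.0, "tm": -2.0},
    "hyp_b": {"lm": -2.5, "tm": -3.0},
}

# Two hypothetical parameter settings; a selection method would choose
# between such settings for each input sentence.
weights_1 = {"lm": 0.3, "tm": 0.7}
weights_2 = {"lm": 0.7, "tm": 0.3}

if __name__ == "__main__":
    print(best_candidate(candidates, weights_1))
    print(best_candidate(candidates, weights_2))
```

With these illustrative numbers the first setting, which favors the translation model, ranks `hyp_a` first, while the second, which favors the language model, ranks `hyp_b` first.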
Pages: 63-71
Number of pages: 9
Related papers
50 in total
  • [1] A phrase similarity-based model for statistical machine translation
    He, Zhongjun
    Liu, Qun
    Lin, Shouxun
    Gaojishu Tongxin/Chinese High Technology Letters, 2009, 19 (04): 337-341
  • [2] Bilingual Sense Similarity for Statistical Machine Translation
    Chen, Boxing
    Foster, George
    Kuhn, Roland
    ACL 2010: 48TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2010: 834-843
  • [3] Measuring Domain Similarity for Statistical Machine Translation
    Liu, Lin
    Cao, Hailong
    Zhao, Tiejun
    2013 10TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (FSKD), 2013: 611-615
  • [4] The Maximum Entropy based Rule Selection Model for Statistical Machine Translation
    Liu, Qun
    He, Zhongjun
    PROCEEDINGS OF THE SECOND INTERNATIONAL SYMPOSIUM ON UNIVERSAL COMMUNICATION, 2008: 89-96
  • [5] An online relevant set algorithm for statistical machine translation
    Tillmann, Christoph
    Zhang, Tong
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2008, 16 (07): 1274-1286
  • [6] Text Genre - An Unexplored Parameter in Statistical Machine Translation
    Gavrila, Monica
    Vertan, Cristina
    HUMAN LANGUAGE TECHNOLOGY CHALLENGES FOR COMPUTER SCIENCE AND LINGUISTICS, 2014, 8387: 456-467
  • [7] Statistical machine translation based on translation rules
    Yulian, H.
    Journal of Chemical and Pharmaceutical Research, 2014, 6 (07): 1628-1635
  • [8] Machine Translation of a Training Set for Semantic Extraction of Relations
    Pena-Torres, Jefferson A.
    Bucheli, Victor
    Gutierrez De Pinerez Reyes, Raul E.
    CUADERNOS DE LINGUISTICA HISPANICA, 2022, 39
  • [9] Bilingual recursive neural network based data selection for statistical machine translation
    Wong, Derek F.
    Lu, Yi
    Chao, Lidia S.
    KNOWLEDGE-BASED SYSTEMS, 2016, 108: 15-24
  • [10] A Quality-based Active Sample Selection Strategy for Statistical Machine Translation
    Logacheva, Varvara
    Specia, Lucia
    LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014: 2690-2695