Training Set Similarity Based Parameter Selection for Statistical Machine Translation

Authors
Shi, Xuewen [1]
Huang, Heyan [1]
Jian, Ping [1]
Tang, Yi-Kun [1]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing Engn Res Ctr High Volume Language Informa, Beijing 100081, Peoples R China
Source
WEB AND BIG DATA (APWEB-WAIM 2018), PT I | 2018 / Vol. 10987
Funding
National Natural Science Foundation of China;
Keywords
Statistical machine translation; Log-linear model; Parameter selection;
DOI
10.1007/978-3-319-96890-2_6
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Statistical machine translation (SMT) systems based on log-linear models usually combine multiple feature functions, each of which is assigned a weight as a model parameter. In this paper, we argue that different input source sentences may call for different model parameters. To adapt the model to different inputs, we propose a model parameter selection method for log-linear SMT systems. The method relies mainly on the characteristics of the feature functions themselves and makes no assumptions about unseen test sets. Experimental results on two language pairs (Zh-En and Ug-Zh) show that our method yields improvements of up to 2.4 and 2.2 BLEU points, respectively, and that the proposed method is readily interpretable.
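As a rough illustration of why feature weights matter in a log-linear model, the sketch below scores translation candidates as a weighted sum of feature values and shows that two different weight vectors can prefer different candidates. It is not the selection method of the paper: the feature names (`tm`, `lm`, `wp`), their values, and both weight vectors are hypothetical and chosen only for illustration.

```python
def loglinear_score(features, weights):
    """Log-linear score: sum over features of weight_i * h_i(e, f)."""
    return sum(weights[name] * value for name, value in features.items())


def best_candidate(candidates, weights):
    """Return the candidate with the highest log-linear score."""
    return max(candidates, key=lambda c: loglinear_score(c["features"], weights))


# Hypothetical feature values (e.g. log-probabilities) for two candidates.
candidates = [
    {"text": "candidate A", "features": {"tm": -2.0, "lm": -6.0, "wp": -4.0}},
    {"text": "candidate B", "features": {"tm": -3.5, "lm": -3.0, "wp": -5.0}},
]

# Two hypothetical weight vectors: one trusts the translation model more,
# the other trusts the language model more.
weights_tm_heavy = {"tm": 1.0, "lm": 0.2, "wp": 0.1}
weights_lm_heavy = {"tm": 0.2, "lm": 1.0, "wp": 0.1}

if __name__ == "__main__":
    print(best_candidate(candidates, weights_tm_heavy)["text"])
    print(best_candidate(candidates, weights_lm_heavy)["text"])
```

With these made-up numbers, the first weight vector prefers candidate A and the second prefers candidate B, which is the kind of input-dependent sensitivity to parameters that the abstract refers to.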
Pages: 63-71
Number of pages: 9