Modelling lexical redundancy for machine translation

Authors
Talbot, David [1 ]
Osborne, Miles [1 ]
Affiliation
[1] Univ Edinburgh, Sch Informat, Edinburgh EH8 9LW, Midlothian, Scotland
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Certain distinctions made in the lexicon of one language may be redundant when translating into another language. We quantify redundancy among source types by the similarity of their distributions over target types. We propose a language-independent framework for minimising lexical redundancy that can be optimised directly from parallel text. Optimisation of the source lexicon for a given target language is viewed as model selection over a set of cluster-based translation models. Redundant distinctions between types may exhibit monolingual regularities, for example, inflexion patterns. We define a prior over model structure using a Markov random field and learn features over sets of monolingual types that are predictive of bilingual redundancy. The prior makes model selection more robust without the need for language-specific assumptions regarding redundancy. Using these models in a phrase-based SMT system, we show significant improvements in translation quality for certain language pairs.
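The idea of measuring redundancy between source types by the similarity of their distributions over target types can be illustrated with a minimal sketch. The abstract does not name the similarity measure, so the Jensen-Shannon divergence used here is an assumption, and the alignment counts, the `ALIGNED` table, and the `threshold` parameter are hypothetical values invented for illustration rather than data or settings from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def translation_distribution(counts):
    """Normalise raw target-type counts into a probability distribution."""
    total = sum(counts.values())
    return {target: c / total for target, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two dict-based distributions.

    Assumption: the paper's own similarity measure is not given in the
    abstract; JSD is swapped in here as one common symmetric choice.
    """
    support = set(p) | set(q)
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in support}

    def kl_to_mixture(a):
        # m[t] > 0 whenever a[t] > 0, so the ratio is always defined.
        return sum(a[t] * math.log2(a[t] / m[t]) for t in a if a[t] > 0.0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Hypothetical word-alignment counts: source type -> counts of target types.
ALIGNED = {
    "rouge": Counter({"red": 48, "ruddy": 2}),
    "rouges": Counter({"red": 47, "ruddy": 3}),
    "chat": Counter({"cat": 50}),
}


def redundant_pairs(aligned, threshold=0.05):
    """Return source-type pairs whose target distributions nearly coincide."""
    dists = {s: translation_distribution(c) for s, c in aligned.items()}
    pairs = []
    for a, b in combinations(sorted(dists), 2):
        divergence = js_divergence(dists[a], dists[b])
        if divergence < threshold:
            pairs.append((a, b, divergence))
    return pairs


if __name__ == "__main__":
    for a, b, divergence in redundant_pairs(ALIGNED):
        print(f"{a} ~ {b}: JSD = {divergence:.4f} bits")
```

In this toy setting only the two inflected forms fall below the threshold, which mirrors the abstract's remark that redundant distinctions may follow monolingual regularities such as inflexion patterns. A pairwise threshold is only a stand-in: the paper frames the problem as model selection over cluster-based translation models with a Markov random field prior, which this sketch does not attempt.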
Pages: 969-976
Number of pages: 8