Creating a Corpus for Russian Data-to-Text Generation Using Neural Machine Translation and Post-Editing

Authors
Shimorina, Anastasia [1, 2]
Khasanova, Elena [1]
Gardent, Claire [2, 3]
Affiliations
[1] Lorraine Univ, Nancy, France
[2] LORIA, Nancy, France
[3] CNRS, Nancy, France
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose an approach for semi-automatically creating a data-to-text (D2T) corpus for Russian that can be used to learn a D2T natural language generation model. An error analysis of the output of an English-to-Russian neural machine translation system shows that 80% of the automatically translated sentences contain an error and that 53% of all translation errors bear on named entities (NE). We therefore focus on named entities and introduce two post-editing techniques for correcting wrongly translated NEs.
Pages: 44-49
Number of pages: 6