Complex Technology of Machine Translation Resources Extension for the Kazakh Language

Cited by: 1
Authors
Rakhimova, Diana [1]
Zhumanov, Zhandos [1]
Affiliation
[1] Al Farabi Kazakh Natl Univ, Lab Intelligent Informat Syst, Alma Ata, Kazakhstan
Keywords
Linguistic resources; Low resources languages; Parallel corpora; Dictionaries; Transfer rules
DOI
10.1007/978-3-319-56660-3_26
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper is devoted to creating linguistic resources, such as parallel corpora, dictionaries, and transfer rules, for machine translation of low-resource languages. We describe the use of the Bitextor tool for mining parallel corpora from online texts and a dictionary enrichment methodology that lets people without deep linguistic knowledge improve word dictionaries, and we show how transfer rules for machine translation can be learned automatically from a parallel corpus. All of the described methods were applied to Kazakh, Russian, and English, with machine translation between these languages in mind.
Pages: 297-307
Page count: 11