Evaluation of Dictionary Creating Methods for Under-Resourced Languages

Cited by: 1
Authors
Simon, Eszter [1 ]
Mittelholcz, Ivan [1 ]
Affiliations
[1] Hungarian Acad Sci, Res Inst Linguist, Benczur U 33, H-1068 Budapest, Hungary
Source
Funding
Hungarian Scientific Research Fund;
Keywords
Bilingual dictionaries; Evaluation; Under-resourced languages; Dictionary building methods;
DOI
10.1007/978-3-319-64206-2_28
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present several bilingual dictionary building methods applied to the Northern Saami-{English, Finnish, Hungarian, Russian} language pairs. Since Northern Saami is an under-resourced language and standard dictionary building methods require a large amount of pre-processed data, we had to find alternative methods. In a thorough evaluation, we compared the results of each method, which confirmed our expectation that the precision of standard lexicon building methods is quite low. The most precise method uses Wikipedia title pairs extracted via inter-language links, but Wiktionary-based methods also provided useful results.
Pages: 246-254
Number of pages: 9