Evaluation of Dictionary Creating Methods for Under-Resourced Languages

Cited by: 1
Authors
Simon, Eszter [1 ]
Mittelholcz, Ivan [1 ]
Affiliations
[1] Hungarian Acad Sci, Res Inst Linguist, Benczur U 33, H-1068 Budapest, Hungary
Funding
Hungarian Scientific Research Fund;
Keywords
Bilingual dictionaries; Evaluation; Under-resourced languages; Dictionary building methods;
DOI
10.1007/978-3-319-64206-2_28
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present several bilingual dictionary building methods applied to the Northern Saami-{English, Finnish, Hungarian, Russian} language pairs. Since Northern Saami is an under-resourced language and standard dictionary building methods require a large amount of pre-processed data, we had to find alternative methods. In a thorough evaluation, we compared the results of each method, which confirmed our expectation that the precision of standard lexicon building methods is quite low. The most precise method uses Wikipedia title pairs extracted via inter-language links, but Wiktionary-based methods also provided useful results.
Pages: 246-254
Number of pages: 9