An unsupervised method for weighting finite-state morphological analyzers

Citations: 0
Authors
Keleg, Amr [1]
Tyers, Francis M.
Howell, Nicholas
Pirinen, Tommi A.
Affiliations
[1] Hamburger Zentrum Sprachkorpora, Fac Engn, Dept Linguist, Sch Linguist, Hamburg, Germany
Keywords
FSTs; FST weighting; constraint grammar; word2vec
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Morphological analysis has been studied for many years, and a range of techniques have been used to build models for it. Models based on finite-state transducers (FSTs) have proved more suitable for languages with few available resources. In this paper, we develop a method for weighting a morphological analyzer built with finite-state transducers in order to disambiguate its output. The method is based on a word2vec model that is trained in a completely unsupervised way on raw, untagged corpora and is able to capture the semantic meaning of words. Most methods for disambiguating the output of a morphological analyzer rely on tagged corpora that need to be built manually. In addition, the method uses information about the token itself, irrespective of its context, unlike most other techniques, which rely heavily on the word's context to disambiguate the set of candidate analyses.
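The abstract does not spell out the weighting algorithm, so the following is only a minimal sketch of one way context-free, embedding-based weighting could look, not the authors' method. It assumes each candidate tag is represented by the centroid of the embeddings of words that are unambiguous for that tag, and that weights are negative log-probabilities (lower is better, as in the tropical semiring commonly used for weighted FSTs). The toy 2-dimensional vectors, the tag names `<n>` and `<vblex>`, and the helper names are all hypothetical; in practice the vectors would come from a word2vec model trained on raw text.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def weight_analyses(token_vec, candidate_tags, tag_centroids):
    """Return {tag: weight}, where a lower weight marks a preferred analysis.

    Assumption: similarities are turned into probabilities with a softmax,
    and each weight is the negative log of that probability.
    """
    sims = {t: cosine(token_vec, tag_centroids[t]) for t in candidate_tags}
    z = sum(math.exp(s) for s in sims.values())
    return {t: -math.log(math.exp(s) / z) for t, s in sims.items()}

# Hypothetical toy embeddings standing in for word2vec vectors.
embeddings = {
    "dog": [0.9, 0.1], "cat": [0.8, 0.2],   # treated as unambiguous nouns
    "run": [0.1, 0.9], "eat": [0.2, 0.8],   # treated as unambiguous verbs
    "walk": [0.3, 0.7],                     # ambiguous: noun or verb
}

# One centroid per tag, built only from words assumed to be unambiguous.
tag_centroids = {
    "<n>": centroid([embeddings["dog"], embeddings["cat"]]),
    "<vblex>": centroid([embeddings["run"], embeddings["eat"]]),
}

# Weight the candidate analyses of "walk" using only the token's own vector.
weights = weight_analyses(embeddings["walk"], ["<n>", "<vblex>"], tag_centroids)
best = min(weights, key=weights.get)
print(best, weights)
```

Note that the scoring uses only the token's own vector and never looks at neighbouring words, which mirrors the context-independent property the abstract describes.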
Pages: 3842-3850
Number of pages: 9