Intrinsic Evaluation of Lithuanian Word Embeddings Using WordNet

Cited by: 4
Authors
Kapociute-Dzikiene, Jurgita [1]
Damasevicius, Robertas [2]
Affiliations
[1] Vytautas Magnus Univ, K Donelaicio 58, LT-44248 Kaunas, Lithuania
[2] Kaunas Univ Technol, K Donelaicio 73, LT-44029 Kaunas, Lithuania
Keywords
Intrinsic evaluation; Neural word embeddings; The Lithuanian language;
DOI
10.1007/978-3-319-91189-2_39
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural network-based word embeddings, which outperform traditional approaches in various Natural Language Processing tasks, have gained a lot of interest recently. Despite this, Lithuanian word embeddings have never been obtained and evaluated before. Here we used a Lithuanian corpus of approximately 234 thousand running words and produced several word embedding models, varying the architecture (continuous bag-of-words or skip-gram), the training algorithm (softmax or negative sampling), and the number of dimensions (100, 300, 500, and 1,000). Word embeddings were evaluated using the Lithuanian WordNet as the resource for the synonym search. We determined the superiority of the continuous bag-of-words over the skip-gram architecture, while the training algorithm and dimensionality showed no significant impact on the results. The best results were achieved with continuous bag-of-words, negative sampling, and 1,000 dimensions.
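The setup summarized in the abstract can be illustrated with a small sketch. It enumerates the model grid named there (two architectures, two training algorithms, four dimensionalities) and scores a set of vectors by a synonym search against a gold synonym list. Everything beyond that grid is an assumption made for illustration: the toy vectors, the toy synonym pairs, the function names, and the hit-rate metric are hypothetical and are not taken from the paper or from the Lithuanian WordNet.

```python
import math
from itertools import product

# Grid taken from the abstract: architecture x training algorithm x dimensions.
ARCHITECTURES = ("cbow", "skip-gram")
TRAINING_ALGORITHMS = ("softmax", "negative-sampling")
DIMENSIONS = (100, 300, 500, 1000)


def model_grid():
    """Return every (architecture, training algorithm, dimensions) combination."""
    return list(product(ARCHITECTURES, TRAINING_ALGORITHMS, DIMENSIONS))


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest(word, vectors, k):
    """Return the k words most similar to `word`, excluding the word itself."""
    scored = [
        (cosine(vectors[word], vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]


def synonym_hit_rate(vectors, synonyms, k=1):
    """Fraction of query words whose top-k neighbours contain a gold synonym.

    Assumed metric for illustration only; the paper's exact scoring
    is not stated in the abstract.
    """
    queries = [w for w in synonyms if w in vectors]
    if not queries:
        return 0.0
    hits = sum(1 for w in queries if set(nearest(w, vectors, k)) & synonyms[w])
    return hits / len(queries)


# Hand-made 3-dimensional toy vectors (hypothetical, not trained embeddings).
TOY_VECTORS = {
    "namas": [0.9, 0.1, 0.0],
    "būstas": [0.8, 0.2, 0.1],
    "šuo": [0.1, 0.9, 0.2],
    "katė": [0.0, 0.2, 0.9],
}

# Hand-made gold pairs (hypothetical, not extracted from the Lithuanian WordNet).
TOY_SYNONYMS = {"namas": {"būstas"}, "būstas": {"namas"}}

if __name__ == "__main__":
    print(len(model_grid()), "model configurations")
    print("hit rate @1:", synonym_hit_rate(TOY_VECTORS, TOY_SYNONYMS, k=1))
```

In a real experiment, each grid entry would correspond to one trained model, and the same scoring function would be applied to each model's vectors so that the configurations can be compared on equal terms.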
Pages: 394-404
Number of pages: 11
Related papers
50 in total
  • [1] Improving WordNet using Word Embeddings
    Chiru, Costin-Gabriel
    Truica, Ciprian-Octavian
    Apostol, Elena-Simona
    Ionescu, Alexandru
    [J]. 2021 23RD INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING (SYNASC 2021), 2021, : 121 - 128
  • [2] A Fistful of Vectors: A Tool for Intrinsic Evaluation of Word Embeddings
    Ascari, Roberto
    Giabelli, Anna
    Malandri, Lorenzo
    Mercorio, Fabio
    Mezzanzanica, Mario
    [J]. COGNITIVE COMPUTATION, 2024, 16 (03) : 949 - 963
  • [3] Intrinsic and Extrinsic Evaluations of Word Embeddings
    Zhai, Michael
    Tan, Johnny
    Choi, Jinho D.
    [J]. THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 4282 - 4283
  • [4] Training and intrinsic evaluation of lightweight word embeddings for the clinical domain in Spanish
    Chiu, Carolina
    Villena, Fabian
    Martin, Kinan
    Nunez, Fredy
    Besa, Cecilia
    Dunstan, Jocelyn
    [J]. FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2022, 5
  • [5] WordNet Embeddings
    Saedi, Chakaveh
    Branco, Antonio
    Rodrigues, Joao Antonio
    Silva, Joao Ricardo
    [J]. REPRESENTATION LEARNING FOR NLP, 2018, : 122 - 131
  • [6] Application of WordNet and word embeddings in the development of prototypes for automatic language generation
    Dominguez Vazquez, Maria Jose
    [J]. LINGUAMATICA, 2020, 12 (02): : 71 - 80
  • [7] Evaluation of Croatian Word Embeddings
    Svoboda, Lukas
    Beliga, Slobodan
    [J]. PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 1512 - 1518
  • [8] Word Embeddings Evaluation and Combination
    Ghannay, Sahar
    Favre, Benoit
    Esteve, Yannick
    Camelin, Nathalie
    [J]. LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2016, : 300 - 305
  • [9] Improving Vietnamese WordNet using word embedding
    Khang Nhut Lam
    Tuan Huynh To
    Thong Tri Tran
    Kalita, Jugal
    [J]. NLPIR 2019: 2019 3RD INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND INFORMATION RETRIEVAL, 2019, : 110 - 114
  • [10] Word sense disambiguation using extended WordNet
    Naskar, Sudip Kumar
    Bandyopadhyay, Sivaji
    [J]. ICCTA 2007: INTERNATIONAL CONFERENCE ON COMPUTING: THEORY AND APPLICATIONS, PROCEEDINGS, 2007, : 446 - +