A metaheuristic with a neural surrogate function for Word Sense Disambiguation

Cited by: 1
Authors
Nodehi, Azim Keshavarzian [1]
Charkari, Nasrollah Moghadam [1]
Affiliations
[1] Tarbiat Modares Univ, Tehran, Iran
Source
Keywords
Word Sense Disambiguation; Metaheuristics; Surrogate Functions; Sense Mapping
DOI
10.1016/j.mlwa.2022.100369
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Word Sense Disambiguation (WSD), one of the earliest problems in natural language processing, aims to determine the correct sense of a word in context. The semantic information provided by WSD systems benefits many tasks, such as machine translation, information extraction, and semantic parsing. This work proposes a new approach to WSD that uses a neural network as a surrogate fitness function in a metaheuristic algorithm. It also proposes a new method for training word and sense embeddings simultaneously: the node2vec algorithm is run on the WordNet graph to generate sequences containing both words and senses, and these sequences are then used, along with paragraphs from Wikipedia, in the word2vec algorithm to generate embeddings for words and senses at the same time. To address data imbalance in this task, sense probability distribution data extracted from the training corpus is used in the search process of the proposed simulated annealing algorithm. Furthermore, we introduce a new approach for clustering and mapping senses in the WordNet graph, which considerably improves the accuracy of the proposed method. In this approach, nodes in the WordNet graph are clustered under the condition that no two senses of the same word appear in the same cluster. Then, repeatedly, all nodes in each cluster are mapped to a randomly selected node from that cluster, so that the representative node can take advantage of the training instances of all the other nodes in the cluster. The proposed method is trained on the SemCor dataset, with the SemEval-2015 dataset used as the validation set. The final evaluation of the system is performed on SensEval-2, SensEval-3, SemEval-2007, SemEval-2013, SemEval-2015, and the concatenation of all five datasets.
The performance of the system is also evaluated on the four content word categories: nouns, verbs, adjectives, and adverbs. Experimental results show that the proposed method achieves accuracies ranging from 74.8 to 84.6 percent across the ten aforementioned evaluation categories, which is close to, and in some cases better than, the state of the art in this task.
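The search procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the cooling schedule, the move operator, or the network architecture, so the geometric cooling, the single-word resampling move, and the hand-written `toy_fitness` function (a stand-in for the neural surrogate) are all assumptions, as are the names `simulated_annealing_wsd`, `WORDS`, and `PRIORS`. The only ideas taken from the abstract are that a surrogate function scores a joint sense assignment and that sense probabilities from the training corpus guide the search.

```python
import math
import random
from typing import Callable, Dict, List, Sequence

# Hypothetical toy data: candidate senses per word with prior probabilities.
# In the paper, such priors are extracted from the training corpus.
WORDS = ["bank", "deposit", "interest"]
PRIORS: Dict[str, Dict[str, float]] = {
    "bank": {"bank.finance": 0.7, "bank.river": 0.3},
    "deposit": {"deposit.money": 0.6, "deposit.sediment": 0.4},
    "interest": {"interest.money": 0.5, "interest.curiosity": 0.5},
}
# Hypothetical topic labels used only by the toy fitness function below.
TOPICS = {
    "bank.finance": "money", "bank.river": "nature",
    "deposit.money": "money", "deposit.sediment": "nature",
    "interest.money": "money", "interest.curiosity": "mind",
}


def toy_fitness(assignment: List[str]) -> float:
    """Stand-in for the neural surrogate: counts sense pairs sharing a topic."""
    score = 0
    for i, a in enumerate(assignment):
        for b in assignment[i + 1:]:
            score += TOPICS[a] == TOPICS[b]
    return float(score)


def simulated_annealing_wsd(
    words: Sequence[str],
    priors: Dict[str, Dict[str, float]],
    fitness: Callable[[List[str]], float],
    steps: int = 2000,
    t_start: float = 1.0,
    t_end: float = 0.01,
    seed: int = 0,
) -> List[str]:
    """Maximize `fitness` over joint sense assignments; return the best one seen."""
    rng = random.Random(seed)

    def sample_sense(word: str) -> str:
        # Draw a sense in proportion to its prior probability (assumed move).
        senses = list(priors[word])
        weights = [priors[word][s] for s in senses]
        return rng.choices(senses, weights=weights, k=1)[0]

    current = [sample_sense(w) for w in words]
    current_score = fitness(current)
    best, best_score = list(current), current_score

    for step in range(steps):
        # Geometric cooling from t_start down to t_end (assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        candidate = list(current)
        i = rng.randrange(len(words))
        candidate[i] = sample_sense(words[i])
        delta = fitness(candidate) - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, current_score + delta
            if current_score > best_score:
                best, best_score = list(current), current_score
    return best


if __name__ == "__main__":
    print(simulated_annealing_wsd(WORDS, PRIORS, toy_fitness))
```

Sampling moves from the priors biases the search toward frequent senses while still allowing rare ones, which is one plausible reading of how the sense distribution enters the search; a trained network would replace `toy_fitness` without changing the loop.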
Pages: 11