Word2vec's Distributed Word Representation for Hindi Word Sense Disambiguation

Cited by: 4
Authors
Kumari, Archana [1]
Lobiyal, D. K. [1]
Affiliation
[1] Jawaharlal Nehru Univ, New Delhi, India
Keywords
Natural Language Processing; Hindi Word Sense Disambiguation; Word2vec; Word embedding
DOI
10.1007/978-3-030-36987-3_21
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Word Sense Disambiguation (WSD) is the task of identifying the appropriate sense of an ambiguous word in a sentence. WSD is essential for language processing, as it is a prerequisite for determining the closest interpretation in various language-based applications. In this paper, we exploit word embeddings to address WSD for Hindi text. The task involves two steps: creating the word embeddings and using cosine similarity to identify the appropriate sense of a word. We consider the two most widely used word2vec architectures, Skip-Gram and Continuous Bag-of-Words [2], to build the embeddings. We then select the sense with the closest proximity to identify the meaning of an ambiguous word. To demonstrate the effectiveness of the proposed model, we performed experiments on large corpora and achieved an accuracy of nearly 52%.
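The second step described in the abstract, choosing a sense by cosine similarity, can be illustrated with a minimal sketch. The abstract does not say how context and sense vectors are built, so everything here is an assumption made for illustration: the context and each candidate sense are represented by the average of their word vectors, the helper names (`cosine`, `mean_vector`, `pick_sense`) are hypothetical, and the three-dimensional vectors are hand-made toy values rather than output of a trained Skip-Gram or CBOW model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def mean_vector(words, embeddings):
    """Average the vectors of the words that have an embedding; None if none do."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def pick_sense(context_words, sense_glosses, embeddings):
    """Return the sense whose averaged gloss vector is closest to the averaged context vector."""
    context_vec = mean_vector(context_words, embeddings)
    if context_vec is None:
        return None
    best_sense, best_score = None, -2.0  # cosine similarity is never below -1
    for sense, gloss_words in sense_glosses.items():
        gloss_vec = mean_vector(gloss_words, embeddings)
        if gloss_vec is None:
            continue
        score = cosine(context_vec, gloss_vec)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

# Toy, hand-made vectors (assumption): real vectors would come from a trained model.
toy_embeddings = {
    "रात": [0.9, 0.1, 0.0],     # night
    "बिस्तर": [0.8, 0.2, 0.1],  # bed
    "नींद": [0.9, 0.0, 0.1],    # sleep (noun)
    "आराम": [0.7, 0.1, 0.2],    # rest
    "धातु": [0.0, 0.9, 0.3],    # metal
    "गहना": [0.1, 0.8, 0.4],    # jewellery
}

# Hypothetical sense inventory for the ambiguous word "सोना" (gold / to sleep).
sense_glosses = {
    "gold": ["धातु", "गहना"],
    "sleep": ["नींद", "आराम"],
}

chosen = pick_sense(["रात", "बिस्तर"], sense_glosses, toy_embeddings)
print(chosen)  # prints "sleep"
```

Averaging is only one simple way to obtain context and sense vectors; the paper itself may construct them differently.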
Pages: 325-335
Number of pages: 11