Unsupervised Approach to Word Sense Disambiguation in Malayalam

Cited by: 4
Authors
Sankar, Sruthi K. P. [1]
Raj, P. C. Reghu [1]
Jayan, V. [2]
Affiliations
[1] Govt Engn Coll, Palakkad, Kerala, India
[2] CDAC, Trivandrum, Kerala, India
Keywords
Word sense disambiguation; Unsupervised methods; Information extraction; Collocations; Context similarity
DOI
10.1016/j.protcy.2016.05.106
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Word Sense Disambiguation (WSD) is the task of identifying the correct sense of a word in a specific context when the word has multiple meanings. WSD is an important intermediate step in many Natural Language Processing (NLP) tasks, especially Information Extraction (IE), Machine Translation (MT), and Question Answering systems. Word sense ambiguity arises when a particular word has more than one possible sense, and every language contains many ambiguous words. Since the sense of a word depends on its context of use, the disambiguation process requires an understanding of word knowledge. Automatic WSD systems are available for structured languages such as English and Chinese, but Indian languages are morphologically rich, which makes the processing task very complex. The aim of this work is to develop a WSD system for Malayalam, a language spoken in India, predominantly in the state of Kerala. The proposed system uses a corpus collected from various Malayalam web documents. For each possible sense of the ambiguous word, a relatively small set of training examples (a seed set) that represents the sense is identified. Collocations and the most frequently co-occurring words are used as training examples. A seed set expansion module extends each seed set by adding the words most similar to its elements. These extended sets act as sense clusters. The sense cluster most similar to the context of the input text is taken as the sense of the target word. © 2016 The Authors. Published by Elsevier Ltd.
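The pipeline described in the abstract (seed sets, seed set expansion, sense clusters, context matching) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The abstract does not name the similarity measures, so the sketch assumes cosine similarity over sentence-level co-occurrence counts for expanding seeds and a simple word-overlap count for matching a context to a cluster. The toy English corpus, the hand-picked seeds, and all function names are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt

def cooccurrence_vectors(sentences):
    """Map each word to a Counter of the words that share a sentence with it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        words = set(sentence)
        for word in words:
            for other in words:
                if other != word:
                    vectors[word][other] += 1
    return dict(vectors)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (assumed measure)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def expand_seed_set(seeds, vectors, target, per_seed=2):
    """Extend a seed set with the words most similar to each seed element."""
    expanded = set(seeds)
    for seed in sorted(seeds):
        seed_vec = vectors.get(seed, Counter())
        scored = sorted(
            ((cosine(seed_vec, vec), word)
             for word, vec in vectors.items()
             if word != target and word not in seeds),
            key=lambda pair: (-pair[0], pair[1]),  # best score first, ties by word
        )
        expanded.update(word for score, word in scored[:per_seed] if score > 0)
    return expanded

def disambiguate(context, clusters):
    """Pick the sense whose cluster overlaps most with the context words."""
    context_words = set(context)
    scores = {sense: len(context_words & cluster) for sense, cluster in clusters.items()}
    return max(sorted(scores), key=scores.get)

# Hypothetical toy corpus for the ambiguous target word "bank".
corpus = [
    ["bank", "money", "loan", "deposit"],
    ["bank", "money", "account", "deposit"],
    ["bank", "river", "water", "shore"],
    ["bank", "river", "fishing", "water"],
    ["loan", "interest", "money"],
    ["river", "water", "boat"],
]
# Hand-picked seeds standing in for collocations and frequent co-occurring words.
seeds = {"finance": {"money"}, "river": {"river"}}

vectors = cooccurrence_vectors(corpus)
clusters = {sense: expand_seed_set(s, vectors, target="bank") for sense, s in seeds.items()}

print(clusters)
print(disambiguate(["deposit", "money", "in", "the", "bank"], clusters))
print(disambiguate(["boat", "on", "the", "river", "bank"], clusters))
```

The target word itself is excluded from every cluster because it co-occurs with all senses and would not help to separate them. For Malayalam, which the abstract describes as morphologically rich, the tokens would presumably need stemming or morphological analysis before counting; that step is outside this sketch.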
Pages: 1507-1513
Number of pages: 7