Word Sense Disambiguation in Bengali: an Unsupervised Approach

Authors
Pal, Alok Ranjan [1]
Saha, Diganta [2]
Affiliations
[1] Coll Engn & Mgmt, Kolaghat, India
[2] Jadavpur Univ, Kolkata, India
Keywords
Natural Language Processing; Sentence Clustering; Type-based Method; Token-based Method; Word Sense Disambiguation
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This work applies an unsupervised methodology to Word Sense Disambiguation (WSD) in the Bengali language. The work consists of two sequential sub-tasks. The first groups Bengali sentences into a certain number of clusters, where each cluster contains sentences of similar meaning. The second labels each cluster with its underlying sense, with the help of a linguistic expert, so that the sense-tagged clusters can serve as a knowledge reference for the WSD task. Clustering was performed using the Weka tool (weka-3-6-13). The test sentences were collected from the Bengali text corpus developed in the TDIL (Technology Development for Indian Languages) project of the Govt. of India. Type-based and Token-based distributional approaches were developed for Bengali sentence clustering. The Type-based method uses a feature vector of the words that co-occur with a target word in a sentence; the Token-based method additionally considers the synsets of the collocating words. These synsets are retrieved from the Bengali WordNet developed at ISI, Kolkata. The baseline result, the achieved result, and the pitfalls of the procedure are discussed in detail in the report.
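The two feature schemes and the clustering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The sentences are English stand-ins rather than TDIL corpus data, `SYNSETS` is a hypothetical toy lookup standing in for the Bengali WordNet, and the small cosine-based k-means stands in for the Weka clusterer. The names `type_features`, `token_features`, and `cluster` are illustrative assumptions.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy data: English stand-ins, not sentences from the TDIL corpus.
SENTENCES = [
    "he sat on the bank of the river",
    "the river bank was muddy",
    "she deposited money in the bank",
    "the bank approved the loan money",
]
TARGET = "bank"
STOPWORDS = {"he", "she", "the", "on", "of", "in", "was"}

# Hypothetical synset lookup standing in for a Bengali WordNet query.
SYNSETS = {"river": {"stream"}, "money": {"cash"}, "loan": {"credit"}}


def type_features(sentence):
    """Type-based scheme: counts of words co-occurring with the target word."""
    words = [w for w in sentence.split() if w != TARGET and w not in STOPWORDS]
    return Counter(words)


def token_features(sentence):
    """Token-based scheme: co-occurring words plus their synset members."""
    feats = type_features(sentence)
    for word in list(feats):
        for synonym in SYNSETS.get(word, ()):
            feats[synonym] += feats[word]
    return feats


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b.get(key, 0) for key in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(vectors, k=2, iterations=10):
    """Small k-means-style clustering using cosine similarity (assumed stand-in)."""
    # Deterministic farthest-first initialisation, starting from the first vector.
    centroids = [vectors[0]]
    while len(centroids) < k:
        farthest = min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        centroids.append(farthest)
    labels = None
    for _ in range(iterations):
        new_labels = [
            max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Each centroid becomes the summed count vector of its members.
        centroids = [
            sum((v for v, l in zip(vectors, labels) if l == j), Counter())
            for j in range(k)
        ]
    return labels


type_labels = cluster([type_features(s) for s in SENTENCES])
token_labels = cluster([token_features(s) for s in SENTENCES])
```

In the workflow the abstract describes, a linguistic expert would then attach a sense label to each resulting cluster, and a new sentence would take the sense of its nearest cluster.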
Pages: 5