A dataset for evaluating Bengali word sense disambiguation techniques

Authors
Das Dawn D. [1]
Khan A. [2]
Shaikh S.H. [3]
Pal R.K. [1]
Affiliations
[1] Department of Computer Science and Engineering, University of Calcutta, Calcutta
[2] Product Development and Diversification, ARP Engineering, Calcutta
[3] Department of Computer Science and Engineering, BML Munjal University, Kapriwas
Keywords
Bengali; Corpora; Dataset; Indo word dataset; Knowledge resources; Word sense disambiguation
DOI
10.1007/s12652-022-04471-y
Abstract
Natural language processing supports effective communication by retrieving the correct sense of each word. A word may be monosemous or polysemous, and using a polysemous word in an appropriate context plays a critical role in communication. Over the last two decades, a significant amount of research has addressed word sense disambiguation, the task of automatically determining the correct sense of a polysemous word. A word sense disambiguation algorithm identifies the proper sense of a polysemous word by analysing its context. Nevertheless, the contemporary literature lacks datasets for Asian languages, especially Bengali. In this work, we therefore present a dataset comprising one hundred Bengali polysemous words. Each word in the dataset has three or four disjoint senses, and each sense is represented by ten paragraphs, each of which describes that sense of the word. We perform a statistical analysis of the dataset based on seven relevant characteristics. We also present a general framework for training and testing, with possible guidelines for performance analysis, and introduce a baseline strategy based on four feature sets. Finally, a set of experiments is performed to analyse system performance. © 2022, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
Pages: 4057-4086
Number of pages: 29