Unsupervised word sense disambiguation with N-gram features

Cited by: 10
Authors
Preotiuc-Pietro, Daniel [1]
Hristea, Florentina [2]
Affiliations
[1] Univ Sheffield, Dept Comp Sci, Sheffield S1 4DP, S Yorkshire, England
[2] Univ Bucharest, Dept Comp Sci, Bucharest 010014, Romania
Keywords
Bayesian classification; The EM algorithm; Word sense disambiguation; Unsupervised disambiguation; Web-scale N-grams
DOI
10.1007/s10462-011-9306-y
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The present paper concentrates on the issue of feature selection for unsupervised word sense disambiguation (WSD) performed with an underlying Naïve Bayes model. It introduces web N-gram features which, to our knowledge, are used for the first time in unsupervised WSD. While creating features from unlabeled data, we are "helping" a simple, basic knowledge-lean disambiguation algorithm to significantly increase its accuracy as a result of receiving easily obtainable knowledge. The performance of this method is compared to that of others that rely on completely different feature sets. Test results concerning nouns, adjectives and verbs show that web N-gram feature selection is a reliable alternative to previously existing approaches, provided that a "quality list" of features, adapted to the part of speech, is used.
Pages: 241-260
Page count: 20
Related papers
  • [1] Unsupervised word sense disambiguation with N-gram features
    Daniel Preotiuc-Pietro
    Florentina Hristea
    [J]. Artificial Intelligence Review, 2014, 41 : 241 - 260
  • [2] Evaluating n-gram Models for a Bilingual Word Sense Disambiguation Task
    Pinto, David
    Vilarino, Darnes
    Balderas, Carlos
    Tovar, Mireya
    Beltran, Beatriz
    [J]. COMPUTACION Y SISTEMAS, 2011, 15 (02) : 209 - 220
  • [3] An unsupervised method for word sense disambiguation
    Rahman, Nazreena
    Borah, Bhogeswar
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2022, 34 (09) : 6643 - 6651
  • [4] Unsupervised Word Sense Disambiguation Using Word Embeddings
    Moradi, Behzad
    Ansari, Ebrahim
    Zabokrtsky, Zdenek
    [J]. PROCEEDINGS OF THE 2019 25TH CONFERENCE OF OPEN INNOVATIONS ASSOCIATION (FRUCT), 2019 : 228 - 233
  • [5] Unsupervised Word Sense Disambiguation with Multilingual Representations
    Fernandez-Ordonez, Erwin
    Mihalcea, Rada
    Hassan, Samer
    [J]. LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012 : 847 - 851
  • [6] Unsupervised Approach to Word Sense Disambiguation in Malayalam
    Sankar, Sruthi K. P.
    Raj, P. C. Reghu
    Jayan, V
    [J]. INTERNATIONAL CONFERENCE ON EMERGING TRENDS IN ENGINEERING, SCIENCE AND TECHNOLOGY (ICETEST - 2015), 2016, 24 : 1507 - 1513
  • [7] Mnogoznal: an Unsupervised System for Word Sense Disambiguation
    Ustalov, Dmitry
    Teslenko, Denis
    Panchenko, Alexander
    Chernoskutov, Mikhail
    [J]. 2017 INTERNATIONAL MULTI-CONFERENCE ON ENGINEERING, COMPUTER AND INFORMATION SCIENCES (SIBIRCON), 2017 : 147 - 150
  • [8] Unsupervised Word Sense Disambiguation Using The WWW
    Klapaftis, Ioannis P.
    Manandhar, Suresh
    [J]. STAIRS 2006, 2006, 142 : 174 - 183
  • [9] Word Sense Disambiguation in Bengali: an Unsupervised Approach
    Pal, Alok Ranjan
    Saha, Diganta
    [J]. PROCEEDINGS OF THE 2017 IEEE SECOND INTERNATIONAL CONFERENCE ON ELECTRICAL, COMPUTER AND COMMUNICATION TECHNOLOGIES (ICECCT), 2017,
  • [10] Learning Sense Representation from Word Representation for Unsupervised Word Sense Disambiguation
    Wang, Jie
    Fu, Zhenxin
    Li, Moxin
    Zhang, Haisong
    Zhao, Dongyan
    Yan, Rui
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 13947 - 13948