Local ensemble learning from imbalanced and noisy data for word sense disambiguation

Cited by: 15
Authors
Krawczyk, Bartosz [1]
McInnes, Bridget T. [1]
Affiliations
[1] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
Keywords
Machine learning; Natural language processing; Imbalanced classification; Multi-class imbalance; Ensemble learning; One-class classification; Class label noise; Word sense disambiguation; SAMPLING APPROACH; CLASSIFICATION; ALGORITHMS;
DOI
10.1016/j.patcog.2017.10.028
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Natural Language Processing plays a key role in man-machine interactions, allowing computers to understand and analyze human language. One of its more challenging sub-domains is word sense disambiguation, the task of automatically identifying the intended sense (or concept) of an ambiguous word based on the context in which the word is used. This requires proper feature extraction to capture specific data properties and a dedicated machine learning solution to allow for the accurate labeling of the appropriate sense. However, the pattern classification problem posed here is highly challenging, as we must deal with high-dimensional and multi-class imbalanced data that additionally may be corrupted with class label noise. To address these issues, we propose a local ensemble learning solution. It uses a one-class decomposition of the multi-class problem, assigning an ensemble of one-class classifiers to each of the distributions. The classifiers are trained on the basis of low-dimensional subsets of features and a kernel feature space transformation to obtain a more compact representation. Instance weighting is used to filter out potentially noisy instances and reduce overlapping among classes. Finally, a two-level classifier fusion technique is used to reconstruct the original multi-class problem. Our results show that the proposed learning approach displays robustness to both multi-class skewed distributions and class label noise, making it a useful tool for the considered task. (C) 2017 Elsevier Ltd. All rights reserved.
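The pipeline summarized in the abstract (one-class decomposition, per-class ensembles on feature subsets, a kernel transformation, instance weighting, and two-level fusion) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the base one-class classifier, the weighting rule, or the fusion operators, so the sketch assumes a simple Gaussian-kernel similarity scorer, weights each training instance by its mean similarity to its class-mates, and fuses by averaging within a class and taking the maximum across classes. All names (`OneClassMember`, `LocalOneClassEnsemble`, the toy sense labels and feature vectors) are hypothetical.

```python
import math
import random
from collections import defaultdict


def rbf(a, b, feats, gamma=1.0):
    """Gaussian kernel between a and b, restricted to the feature indices in feats."""
    d2 = sum((a[i] - b[i]) ** 2 for i in feats)
    return math.exp(-gamma * d2)


class OneClassMember:
    """Toy one-class scorer for a single class on one feature subset (assumed stand-in)."""

    def __init__(self, rows, feats, gamma=1.0):
        self.rows, self.feats, self.gamma = rows, feats, gamma
        # Assumed weighting rule: an instance's weight is its mean kernel
        # similarity to the other instances of the same class, so isolated
        # (possibly mislabeled) instances contribute very little.
        raw = []
        for i, a in enumerate(rows):
            sims = [rbf(a, b, feats, gamma) for j, b in enumerate(rows) if j != i]
            raw.append(sum(sims) / len(sims) if sims else 1.0)
        total = sum(raw) or 1.0
        self.weights = [r / total for r in raw]

    def score(self, x):
        """Weighted kernel similarity of x to this class on the chosen features."""
        return sum(w * rbf(x, a, self.feats, self.gamma)
                   for w, a in zip(self.weights, self.rows))


class LocalOneClassEnsemble:
    """One ensemble of one-class members per class, fused in two levels."""

    def __init__(self, n_members=5, subset_size=2, gamma=1.0, seed=0):
        self.n_members, self.subset_size = n_members, subset_size
        self.gamma, self.seed = gamma, seed
        self.ensembles = {}

    def fit(self, X, y):
        rng = random.Random(self.seed)
        n_features = len(X[0])
        k = min(self.subset_size, n_features)
        by_class = defaultdict(list)
        for x, label in zip(X, y):
            by_class[label].append(x)
        # One-class decomposition: each class gets its own ensemble, and each
        # member sees only a random low-dimensional subset of the features.
        self.ensembles = {
            label: [OneClassMember(rows, rng.sample(range(n_features), k), self.gamma)
                    for _ in range(self.n_members)]
            for label, rows in by_class.items()
        }
        return self

    def predict(self, x):
        # Level 1: average member scores inside each class ensemble.
        support = {label: sum(m.score(x) for m in members) / len(members)
                   for label, members in self.ensembles.items()}
        # Level 2: pick the class whose ensemble gives the highest support.
        return max(support, key=support.get)


# Hypothetical, imbalanced toy data for two senses of "bank"; the last
# "finance" instance sits in the "river" region and plays the role of label noise.
X = [[0.0, 0.1, 0.0], [0.2, 0.0, 0.1], [0.1, 0.2, 0.0], [0.0, 0.0, 0.2],
     [0.1, 0.1, 0.1], [0.2, 0.1, 0.0], [5.0, 5.1, 4.9],
     [5.0, 5.0, 5.0], [5.1, 4.9, 5.0]]
y = ["finance"] * 7 + ["river"] * 2

model = LocalOneClassEnsemble().fit(X, y)
pred_finance = model.predict([0.1, 0.0, 0.2])
pred_river = model.predict([5.1, 5.0, 4.9])
```

In this toy setup the mislabeled "finance" instance receives a near-zero weight because it is far from its nominal class-mates, so it barely affects the score of the minority "river" region. A real system would replace the similarity scorer with a proper one-class model and tune the subset size and kernel width.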
Pages: 103-119
Number of pages: 17