WordNet-based lexical semantic classification for text corpus analysis

Cited by: 0
Authors
Jun Long
Lu-da Wang
Zu-de Li
Zu-ping Zhang
Liu Yang
Affiliations
[1] Central South University, School of Information Science and Engineering
[2] Central South University, School of Software
Source: Journal of Central South University, 2015, 22(05)
Keywords
document representation; lexical semantic content; classification; eigenvector
DOI: not available
Abstract
Many text classification methods rely on statistical term measures for document representation. Such representations ignore the lexical semantic content of terms and the mutual information that can be distilled from it, which leads to classification errors. This work proposes a document representation method, the WordNet-based lexical semantic VSM, to address the problem. Using WordNet, the method constructs a data structure of semantic-element information to characterize lexical semantic content, and it adapts EM modeling to disambiguate word stems. Then, in the lexical-semantic space of the corpus, a lexical-semantic eigenvector is built for each document by calculating the weight of each synset, and this eigenvector is supplied to NWKNN, a widely recognized classification algorithm. On the Reuters-21578 corpus and on an adjusted version of it with lexical replacement, the experimental results show that the lexical-semantic eigenvector outperforms the TF-IDF-based term-statistic eigenvector in both F1 measure and dimension scale. Because the method yields document representation eigenvectors, it has broad prospects for classification applications in text corpus analysis.
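The idea of weighting synsets rather than surface terms can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `STEM_TO_SYNSET` table is a hypothetical stand-in for a WordNet lookup after disambiguation (its IDs are invented for illustration), the EM-based disambiguation and the NWKNN classifier are omitted, and the weight is an ordinary smoothed TF-IDF-style score applied to synsets, chosen only as an assumption since the abstract does not give the weighting formula.

```python
import math
from collections import Counter

# Hypothetical stem -> synset-ID table standing in for a WordNet lookup
# after disambiguation; the IDs are illustrative, not real WordNet entries.
STEM_TO_SYNSET = {
    "car": "syn:motor_vehicle",
    "automobile": "syn:motor_vehicle",
    "buy": "syn:purchase",
    "purchase": "syn:purchase",
    "wheat": "syn:wheat",
    "price": "syn:price",
}

def synset_counts(tokens):
    """Count synset occurrences, skipping tokens with no mapping."""
    return Counter(STEM_TO_SYNSET[t] for t in tokens if t in STEM_TO_SYNSET)

def synset_vectors(docs):
    """Return (sorted synset dimensions, one weight vector per document)."""
    counts = [synset_counts(d) for d in docs]
    dims = sorted(set().union(*counts))
    n = len(docs)
    # Document frequency per synset, used in a smoothed IDF-style factor.
    df = {s: sum(1 for c in counts if s in c) for s in dims}
    vectors = []
    for c in counts:
        total = sum(c.values()) or 1
        vectors.append([(c[s] / total) * math.log(1 + n / df[s]) for s in dims])
    return dims, vectors

def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

docs = [
    ["buy", "car"],
    ["purchase", "automobile"],
    ["wheat", "price"],
]
dims, vecs = synset_vectors(docs)
```

In a plain term space the first two toy documents share no dimension, whereas here they map to the same two synsets and receive identical vectors, so `cosine(vecs[0], vecs[1])` is close to 1 while `cosine(vecs[0], vecs[2])` is 0. The example also hints at the dimension effect mentioned in the abstract: six distinct terms collapse into four synset dimensions.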
Pages: 1833-1840
Page count: 8
Citation: Long Jun, Wang Lu-da, Li Zu-de, Zhang Zu-ping, Yang Liu. WordNet-based lexical semantic classification for text corpus analysis [J]. Journal of Central South University, 2015, 22(05): 1833-1840.