WordNet-based lexical semantic classification for text corpus analysis

Authors
Jun Long
Lu-da Wang
Zu-de Li
Zu-ping Zhang
Liu Yang
Affiliations
[1] Central South University, School of Information Science and Engineering
[2] Central South University, School of Software
Keywords
document representation; lexical semantic content; classification; eigenvector
Abstract
Many text classification methods rely on statistical term measures for document representation. Such representations ignore the lexical semantic content of terms and the mutual information distilled from it, which leads to classification errors. This work proposes a document representation method, the WordNet-based lexical semantic VSM, to address the problem. Using WordNet, the method constructs a data structure of semantic-element information to characterize lexical semantic content, and adapts EM modeling to disambiguate word stems. Then, in the lexical-semantic space of the corpus, a lexical-semantic eigenvector representing each document is built by calculating the weight of each synset, and is applied to the widely recognized NWKNN algorithm. On the Reuters-21578 text corpus and a version of it adjusted by lexical replacement, the experimental results show that the lexical-semantic eigenvector achieves a better F1 measure and dimension scale than the term-statistic eigenvector based on TF-IDF. Forming document representation eigenvectors gives the method broad prospects for classification applications in text corpus analysis.
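The idea of weighting synsets rather than surface terms can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy lexicon `TOY_SYNSETS` stands in for WordNet lookup plus disambiguation, the TF-IDF-style synset weight is a swapped-in weighting, and the helper names `synset_counts`, `synset_vectors`, and `cosine` are hypothetical. The EM-based disambiguation and the NWKNN classifier are not implemented here.

```python
import math
from collections import Counter

# Hypothetical toy lexicon: word stem -> synset identifier.
# A real system would look these up in WordNet and disambiguate them.
TOY_SYNSETS = {
    "car": "car.n.01",
    "automobile": "car.n.01",
    "buy": "buy.v.01",
    "purchase": "buy.v.01",
    "bank": "bank.n.01",
    "loan": "loan.n.01",
}


def synset_counts(tokens):
    """Count synset occurrences for the tokens covered by the toy lexicon."""
    return Counter(TOY_SYNSETS[t] for t in tokens if t in TOY_SYNSETS)


def synset_vectors(docs):
    """Build one sparse synset-weight vector (a dict) per document.

    Assumed weight: synset frequency times a smoothed inverse document
    frequency, mirroring TF-IDF but computed over synsets instead of terms.
    """
    counts = [synset_counts(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(s for c in counts for s in c)
    vectors = []
    for c in counts:
        total = sum(c.values()) or 1  # avoid division by zero for empty docs
        vectors.append({
            s: (f / total) * (math.log((1 + n_docs) / (1 + doc_freq[s])) + 1.0)
            for s, f in c.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(s, 0.0) for s, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


docs = [
    ["buy", "car"],
    ["purchase", "automobile"],
    ["bank", "loan"],
]
vectors = synset_vectors(docs)
print(cosine(vectors[0], vectors[1]))  # synonym documents share all synsets
print(cosine(vectors[0], vectors[2]))  # unrelated documents share none
```

In this sketch the first two documents have no terms in common, so a term-based vector would treat them as unrelated, while their synset vectors coincide. This is the kind of lexical replacement the abstract's adjusted corpus is meant to test.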
Pages: 1833-1840
Number of pages: 7