Dimensionality reduction by semantic mapping in text categorization

Cited by: 0
Authors
Corrêa, RF
Ludermir, TB
Affiliations
[1] Univ Fed Pernambuco, Polytech Sch, BR-50750410 Recife, PE, Brazil
[2] Univ Fed Pernambuco, Ctr Informat, BR-50732970 Recife, PE, Brazil
DOI: not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In text categorization tasks, dimensionality reduction becomes necessary both for computation and for the interpretability of the results generated by machine learning algorithms, owing to the high-dimensional vector representation of documents. This paper describes a new feature extraction method called semantic mapping and its application to the categorization of web documents. Semantic mapping uses self-organizing maps (SOMs) to construct variables in a reduced space, where each variable describes the behavior of a group of semantically related features. The performance of semantic mapping is measured and compared empirically with that of sparse random mapping and PCA; it is shown to be better than random mapping and a good alternative to PCA.
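The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 1-D SOM, its training schedule, the unit-length normalization of term vectors, and the choice to aggregate each group of terms by summation are all assumptions made for illustration, and the names `train_som` and `semantic_mapping` are hypothetical. Each term is represented by its column in a document-term matrix, a small SOM groups the terms, and each SOM unit becomes one variable in the reduced space.

```python
import numpy as np


def train_som(data, n_units, n_iter=200, lr0=0.5, seed=0):
    """Train a 1-D self-organizing map and return its (n_units, dim) codebook."""
    rng = np.random.default_rng(seed)
    sigma0 = max(n_units / 2.0, 1.0)
    # Initialise the codebook from randomly chosen training samples.
    weights = data[rng.integers(0, len(data), size=n_units)].astype(float)
    positions = np.arange(n_units)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-3    # shrinking neighbourhood width
        x = data[rng.integers(0, len(data))]
        # Best-matching unit: the codebook vector closest to the sample.
        bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
        # Gaussian neighbourhood around the BMU on the 1-D grid.
        h = np.exp(-((positions - bmu) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights


def semantic_mapping(doc_term, n_units, seed=0):
    """Reduce a (n_docs, n_terms) matrix to (n_docs, n_units).

    Assumption: each reduced variable is the sum of the term weights
    assigned to one SOM unit; the paper may aggregate differently.
    """
    term_vectors = doc_term.T.astype(float)
    norms = np.linalg.norm(term_vectors, axis=1, keepdims=True)
    term_vectors = term_vectors / np.where(norms == 0, 1.0, norms)
    weights = train_som(term_vectors, n_units, seed=seed)
    # Assign every term to its nearest SOM unit.
    dists = np.linalg.norm(
        term_vectors[:, None, :] - weights[None, :, :], axis=2
    )
    assignment = dists.argmin(axis=1)
    # Binary (n_terms, n_units) membership matrix, one unit per term.
    n_terms = doc_term.shape[1]
    membership = np.zeros((n_terms, n_units))
    membership[np.arange(n_terms), assignment] = 1.0
    return doc_term @ membership, assignment


# Toy document-term counts (made-up data): 4 documents, 4 terms.
doc_term = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
])
reduced, assignment = semantic_mapping(doc_term, n_units=2)
print(reduced.shape)   # (4, 2)
print(assignment)      # SOM unit index for each term
```

Because every term is assigned to exactly one unit, each document's total term weight is preserved in the reduced representation. Unlike random mapping, the grouping depends on the data, so each reduced variable can be inspected by listing the terms assigned to it.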
Pages: 1032-1037
Number of pages: 6