Dimensionality reduction by semantic mapping in text categorization

Cited by: 0
Authors
Corrêa, RF
Ludermir, TB
Affiliations
[1] Univ Fed Pernambuco, Polytech Sch, BR-50750410 Recife, PE, Brazil
[2] Univ Fed Pernambuco, Ctr Informat, BR-50732970 Recife, PE, Brazil
Source
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In text categorization tasks, dimensionality reduction becomes necessary for computation and for the interpretability of the results generated by machine learning algorithms, because documents are represented as high-dimensional vectors. This paper describes a new feature extraction method called semantic mapping and its application to the categorization of web documents. Semantic mapping uses self-organizing maps (SOMs) to construct variables in a reduced space, where each variable describes the behavior of a group of semantically related features. The performance of semantic mapping is measured and compared empirically with that of sparse random mapping and PCA; it proves better than random mapping and a good alternative to PCA.
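The abstract gives only the outline of the method, so the following is a minimal sketch of one plausible reading of it, not the authors' implementation. It assumes that each term is described by its column in a document-term matrix, that a small one-dimensional SOM groups the terms, and that each reduced variable is the sum of the term weights assigned to one map unit. The function names `train_som` and `semantic_mapping`, the learning-rate and neighborhood schedules, and the toy matrix are all hypothetical.

```python
import numpy as np


def train_som(data, n_units, n_iter=500, seed=0):
    """Train a tiny 1-D self-organizing map on the rows of `data`."""
    rng = np.random.default_rng(seed)
    # Initialize unit weights from randomly chosen training rows.
    weights = data[rng.integers(0, len(data), size=n_units)].astype(float)
    positions = np.arange(n_units)
    for t in range(n_iter):
        frac = t / n_iter
        lr = 0.5 * (1.0 - frac)                          # decaying learning rate
        sigma = max(n_units / 2.0 * (1.0 - frac), 0.5)   # shrinking neighborhood
        x = data[rng.integers(len(data))]
        # Best-matching unit: the unit whose weight vector is closest to x.
        bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
        h = np.exp(-((positions - bmu) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights


def semantic_mapping(doc_term, n_units, seed=0):
    """Reduce a (documents x terms) matrix to (documents x n_units)."""
    # Assumption: a term is represented by its length-normalized column.
    term_vectors = doc_term.T.astype(float)
    norms = np.linalg.norm(term_vectors, axis=1, keepdims=True)
    term_vectors = term_vectors / np.where(norms == 0, 1.0, norms)

    weights = train_som(term_vectors, n_units, seed=seed)

    # Assign every term to its best-matching map unit.
    dists = np.linalg.norm(
        term_vectors[:, None, :] - weights[None, :, :], axis=2
    )
    assignment = dists.argmin(axis=1)

    # Binary projection matrix: entry (i, j) is 1 if term i maps to unit j.
    n_terms = doc_term.shape[1]
    projection = np.zeros((n_terms, n_units))
    projection[np.arange(n_terms), assignment] = 1.0
    return doc_term @ projection, assignment


# Toy document-term matrix (4 documents, 6 terms); values are made up.
doc_term = np.array([
    [2, 1, 0, 0, 1, 0],
    [1, 2, 0, 0, 0, 0],
    [0, 0, 3, 1, 0, 1],
    [0, 0, 1, 2, 0, 2],
])
reduced, assignment = semantic_mapping(doc_term, n_units=2)
print(reduced.shape)  # (4, 2)
```

Because every term is assigned to exactly one unit, the projection in this sketch preserves each document's total term weight. Which terms end up grouped together depends on the random seed and the training schedule.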
Pages: 1032-1037
Page count: 6