Abstracting for Dimensionality Reduction in Text Classification

Cited by: 1
Authors
McAllister, Richard A. [1]
Angryk, Rafal A. [1]
Affiliations
[1] Montana State Univ, Dept Comp Sci, Bozeman, MT 59717 USA
DOI
10.1002/int.21543
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
There is a growing interest in efficient models of text mining and an emergent need for new data structures that address word relationships. Detailed knowledge about the taxonomic environment of keywords that are used in text documents can provide valuable insight into the nature of the subject matter contained therein. Such insight may be used to enhance the data structures used in the text data mining task as relationships become usefully apparent. A popular scalable technique used to infer these relationships, while reducing dimensionality, has been Latent Semantic Analysis. We present a new approach, which uses an ontology of lexical abstractions to create abstraction profiles of documents and uses these profiles to perform text organization based on a process that we call frequent abstraction analysis. We introduce TATOO, the Text Abstraction TOOlkit, which is a full implementation of this new approach. We present our data model via an example of how taxonomically derived abstractions can be used to supplement semantic data structures for the text classification task. (C) 2012 Wiley Periodicals, Inc.
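As a rough illustration of the idea described in the abstract, the following minimal Python sketch builds an abstraction profile for a document by walking each keyword up a taxonomy and counting the ancestors it reaches, then keeps the abstractions that occur often. It is not the TATOO implementation: the toy taxonomy `TOY_HYPERNYMS`, the function names, and the `min_count` threshold are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy taxonomy (an assumption, not from the paper):
# each term maps to its direct parent abstraction (hypernym).
TOY_HYPERNYMS = {
    "beagle": "dog",
    "terrier": "dog",
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
}


def abstractions(term, taxonomy):
    """Return the ancestors of `term` in the taxonomy, nearest first."""
    chain = []
    seen = {term}
    while term in taxonomy:
        term = taxonomy[term]
        if term in seen:  # guard against accidental cycles in the taxonomy
            break
        seen.add(term)
        chain.append(term)
    return chain


def abstraction_profile(tokens, taxonomy):
    """Count how often each abstraction is reached from the document's tokens."""
    profile = Counter()
    for token in tokens:
        profile.update(abstractions(token, taxonomy))
    return profile


def frequent_abstractions(profile, min_count):
    """Keep abstractions reached at least `min_count` times (assumed threshold)."""
    return {name for name, count in profile.items() if count >= min_count}


doc = ["beagle", "terrier", "wolf", "cat", "table"]
profile = abstraction_profile(doc, TOY_HYPERNYMS)
frequent = frequent_abstractions(profile, min_count=3)
```

In this toy example five distinct keywords collapse onto a handful of shared abstractions, which is the sense in which abstraction can reduce dimensionality; a real system would presumably draw on a full lexical ontology rather than a hand-written dictionary.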
Pages: 115-138
Page count: 24