An approach to text classification using dimensionality reduction and combination of classifiers

Cited by: 10
Authors
Jain, G [1]
Ginwala, A [1]
Aslandogan, YA [1]
Affiliations
[1] Univ Texas, Dept Comp Sci & Engn, Arlington, TX 76019 USA
Keywords
DOI
10.1109/IRI.2004.1431521
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text classification assigns predetermined categories to textual resources. Its applications include recommendation systems, personalization, help desk automation, content filtering and routing, selective alerting, and text mining. This paper describes an experiment aimed at improving classification accuracy on a large text corpus through dimensionality reduction and multiple-classifier combination techniques. Three classifiers are used: Naive Bayes, J48 Decision Tree, and Decision Table. Their results are combined using techniques such as Simple Voting, Weighted Voting, and Probability-based Voting. Classification accuracy is further improved by a dimensionality reduction method based on concept indexing. Experiments conducted on the Reuters-21578 dataset indicate that the combination approach provides an improved and scalable method for text classification. Concept indexing is also observed to help classification accuracy, in addition to efficiency and scalability.
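The three combination schemes named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the example weights, and the rule of summing class probabilities are assumptions made for illustration, and the record does not say how the paper derives its weights or resolves ties. The labels "earn" and "acq" are used only as sample category names.

```python
from collections import defaultdict


def simple_vote(predictions):
    """Simple voting: each classifier's predicted label counts once."""
    counts = defaultdict(float)
    for label in predictions:
        counts[label] += 1.0
    # Ties go to the label that was seen first (an arbitrary choice here).
    return max(counts, key=counts.get)


def weighted_vote(predictions, weights):
    """Weighted voting: each vote is scaled by a per-classifier weight.

    The weights are hypothetical, e.g. each classifier's validation accuracy.
    """
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


def probability_vote(distributions):
    """Probability-based voting: sum the class probabilities reported by
    each classifier and pick the class with the largest total (assumed rule).
    """
    scores = defaultdict(float)
    for dist in distributions:
        for label, prob in dist.items():
            scores[label] += prob
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical outputs of three classifiers for one document.
    labels = ["earn", "acq", "acq"]
    weights = [0.9, 0.4, 0.4]
    dists = [
        {"earn": 0.90, "acq": 0.10},
        {"earn": 0.40, "acq": 0.60},
        {"earn": 0.45, "acq": 0.55},
    ]
    print(simple_vote(labels))
    print(weighted_vote(labels, weights))
    print(probability_vote(dists))
```

With these made-up inputs, simple voting follows the two-to-one majority, while the weighted and probability-based schemes can overturn it when a single classifier is much more reliable or much more confident than the others.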
Pages: 564-569
Page count: 6
Related papers
50 in total
  • [1] A Comparative Approach of Dimensionality Reduction Techniques in Text Classification
    Basha, Shaik Rahamat
    Rani, J. Keziya
    [J]. ENGINEERING TECHNOLOGY & APPLIED SCIENCE RESEARCH, 2019, 9 (06) : 4974 - 4979
  • [2] Dimensionality Reduction for Sentiment Classification using Machine Learning Classifiers
    Islam, Mazharul
    Anjum, Aftab
    Ahsan, Tanveer
    Wang, Lin
    [J]. 2019 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2019), 2019, : 3097 - 3103
  • [3] Dimensionality reduction in text classification using scatter method
    Saarikoski, Jyri
    Laurikkala, Jorma
    Jarvelin, Kalervo
    Siermala, Markku
    Juhola, Martti
    [J]. INTERNATIONAL JOURNAL OF DATA MINING MODELLING AND MANAGEMENT, 2014, 6 (01) : 1 - 21
  • [4] Abstracting for Dimensionality Reduction in Text Classification
    McAllister, Richard A.
    Angryk, Rafal A.
    [J]. INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2013, 28 (02) : 115 - 138
  • [5] An Efficient Approach for Dimensionality Reduction and Classification of High Dimensional Text Documents
    Kumar, Kotte Vinay
    Srinivasan, R.
    Singh, E. B.
    [J]. PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON DATA SCIENCE, E-LEARNING AND INFORMATION SYSTEMS 2018 (DATA'18), 2018,
  • [6] Taxonomic Dimensionality Reduction in Bayesian Text Classification
    McAllister, Richard
    Sheppard, John
    [J]. 2012 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2012), VOL 1, 2012, : 508 - 513
  • [7] Dimensionality Reduction by Mutual Information for Text Classification
    刘丽珍
    宋瀚涛
    陆玉昌
    [J]. Journal of Beijing Institute of Technology, 2005, (01) : 32 - 36
  • [8] Using Discriminative Dimensionality Reduction to Visualize Classifiers
    Alexander Schulz
    Andrej Gisbrecht
    Barbara Hammer
    [J]. Neural Processing Letters, 2015, 42 : 27 - 54
  • [9] Using Nonlinear Dimensionality Reduction to Visualize Classifiers
    Schulz, Alexander
    Gisbrecht, Andrej
    Hammer, Barbara
    [J]. ADVANCES IN COMPUTATIONAL INTELLIGENCE, PT I, 2013, 7902 : 59 - 68
  • [10] Using Discriminative Dimensionality Reduction to Visualize Classifiers
    Schulz, Alexander
    Gisbrecht, Andrej
    Hammer, Barbara
    [J]. NEURAL PROCESSING LETTERS, 2015, 42 (01) : 27 - 54