Automatic Text Categorization using NTC

Author
Jo, Taeho [1]
Affiliation
[1] Inha Univ, Sch Comp & Informat Engn, Inchon, South Korea
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this research, we propose NTC (Neural Text Categorizer) as an approach to text categorization. Traditional approaches to text categorization require encoding documents into numerical vectors, which leads to two main problems: huge dimensionality and sparse distribution in each numerical vector. In this research, documents are encoded into string vectors instead of numerical vectors, and a new neural network called NTC, which receives a string vector as its input vector, is used for text categorization. The goal of this research is to avoid the two main problems by encoding documents into structured data that serve as an alternative to numerical vectors. We will validate the performance of NTC by comparing it with other machine learning algorithms on the standard test bed, Reuters-21578.
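The contrast the abstract draws between the two encodings can be illustrated with a small sketch. The abstract does not define how a string vector is built, so the version below is an assumption: a fixed-length, ordered list of a document's most frequent words. The function names, the length parameter `d`, and the tie-breaking rule are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def to_numerical_vector(text, vocabulary):
    """Bag-of-words encoding: one count per vocabulary word.

    The vector length equals the vocabulary size, and most entries are
    zero for any single document (the dimensionality and sparsity
    problems the abstract mentions).
    """
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


def to_string_vector(text, d=3):
    """Assumed string-vector encoding: the d most frequent words, in order.

    Ties are broken alphabetically, and short documents are padded with
    empty strings. Both choices are illustrative assumptions.
    """
    counts = Counter(tokenize(text))
    ranked = sorted(counts, key=lambda word: (-counts[word], word))[:d]
    return ranked + [""] * (d - len(ranked))


docs = ["oil prices rise as oil supply falls", "wheat exports rise"]
vocabulary = sorted({word for doc in docs for word in tokenize(doc)})

for doc in docs:
    print(to_numerical_vector(doc, vocabulary), to_string_vector(doc))
```

In this toy corpus the numerical vector already needs one slot per vocabulary word, while the string vector stays at length `d` regardless of vocabulary size. A classifier that consumes string vectors would then need its own notion of similarity between words, which is outside the scope of this sketch.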
Pages: 26-31
Page count: 6