Text categorization based on combination of modified back propagation neural network and latent semantic analysis

Cited by: 27
Authors
Wang, Wei [2]
Yu, Bo [1]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Elect & Informat Engn, Xian 710049, Peoples R China
[2] Sichuan Univ, Inst Image & Informat, Sch Elect & Informat, Chengdu 610065, Peoples R China
Source
NEURAL COMPUTING & APPLICATIONS | 2009, Vol. 18, No. 8
Keywords
Text categorization; Latent semantic analysis; Singular value decomposition; Back propagation neural network; Modified back propagation neural network;
DOI
10.1007/s00521-008-0193-3
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a new text categorization model that combines a modified back propagation neural network (MBPNN) with latent semantic analysis (LSA). The traditional back propagation neural network (BPNN) trains slowly and is easily trapped in a local minimum, which leads to poor performance and efficiency. We propose the MBPNN to accelerate BPNN training and improve categorization accuracy. LSA can overcome the problems caused by matching on individual words by using statistically derived conceptual indices instead. It constructs a conceptual vector space in which each term or document is represented as a vector. This not only greatly reduces the dimensionality but also uncovers important associative relationships between terms. We test our categorization model on the 20 Newsgroups corpus and the Reuters-21578 corpus. Experimental results show that the MBPNN is much faster than the traditional BPNN and also improves on its performance. Applying LSA in our system yields a dramatic dimensionality reduction while still achieving good classification results.
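As an illustration of the LSA step described in the abstract, the following is a minimal sketch of projecting a term-document matrix into a low-dimensional concept space with a truncated singular value decomposition. It is not the authors' implementation: the toy matrix `TERM_DOC`, the helper names `lsa_project` and `cosine`, and the choice of k = 2 are all assumptions made for this example, and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical toy term-document count matrix (rows = terms, columns = documents).
# Terms: car, automobile, engine, flower, petal
TERM_DOC = np.array([
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 1],  # petal
], dtype=float)


def lsa_project(term_doc, k):
    """Project terms and documents into a k-dimensional concept space.

    Returns (U_k, doc_vectors): U_k has one row per term, and doc_vectors
    has one row per document, computed as A^T U_k.
    """
    # Thin SVD: term_doc = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k = U[:, :k]                  # keep the k strongest concept directions
    doc_vectors = term_doc.T @ U_k  # equals V_k @ diag(s_k)
    return U_k, doc_vectors


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


if __name__ == "__main__":
    U_k, docs = lsa_project(TERM_DOC, k=2)
    print("raw cosine(doc0, doc1):", cosine(TERM_DOC[:, 0], TERM_DOC[:, 1]))
    print("LSA cosine(doc0, doc1):", cosine(docs[0], docs[1]))
    print("LSA cosine(car, automobile):", cosine(U_k[0], U_k[1]))
```

In this toy matrix, "car" and "automobile" never appear in the same document, yet both co-occur with "engine", so their vectors coincide in the two-dimensional concept space. The reduced document vectors (2 dimensions instead of 5) are the kind of input that could then be passed to a classifier such as a neural network.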
Pages: 875-881
Page count: 7
Related papers
50 in total
  • [21] Text classification with support vector machine and back propagation neural network
    Zhang, Wen
    Tang, Xijin
    Yoshida, Taketoshi
    COMPUTATIONAL SCIENCE - ICCS 2007, PT 4, PROCEEDINGS, 2007, 4490 : 150 - +
  • [22] Malware Classification Based on the Behavior Analysis and Back Propagation Neural Network
    Pan, Zhi-Peng
    Feng, Chao
    Tang, Chao-Jing
    3RD ANNUAL INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND APPLICATIONS (ITA 2016), 2016, 7
  • [23] English text quality analysis based on recurrent neural network and semantic segmentation
    Luo, Xiaoyu
    Chen, Zhibin
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2020, 112 : 507 - 511
  • [24] Categorization and Monitoring of Internet Public Opinion Based on Latent Semantic Analysis
    Wan, Yuan
    Tong, Hengqing
    ISBIM: 2008 INTERNATIONAL SEMINAR ON BUSINESS AND INFORMATION MANAGEMENT, VOL 2, 2009, : 121 - 124
  • [25] A novel multilingual text categorization system using latent semantic indexing
    Lee, Chung-Hong
    Yang, Hsin-Chang
    Ma, Sheng-Min
    ICICIC 2006: FIRST INTERNATIONAL CONFERENCE ON INNOVATIVE COMPUTING, INFORMATION AND CONTROL, VOL 2, PROCEEDINGS, 2006, : 503 - +
  • [26] Ensemble multi-label text categorization based on rotation forest and latent semantic indexing
    Elghazel, Haytham
    Aussem, Alex
    Gharroudi, Ouadie
    Saadaoui, Wafa
    EXPERT SYSTEMS WITH APPLICATIONS, 2016, 57 : 1 - 11
  • [27] Neural network approaches for text document categorization
    Chen, Zhihang
    Ni, Chengwen
    Murphey, Yi L.
    2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 1054 - +
  • [28] Latent semantic analysis for text segmentation
    Choi, FYY
    Wiemer-Hastings, P
    Moore, J
    PROCEEDINGS OF THE 2001 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, 2001, : 109 - 117
  • [29] lsemantica: A command for text similarity based on latent semantic analysis
    Schwarz, Carlo
    STATA JOURNAL, 2019, 19 (01): : 129 - 142
  • [30] Web Text Classification Based on Improved Latent Semantic Analysis
    Wang, Lan
    Wan, Yuan
    2011 SECOND ETP/IITA CONFERENCE ON TELECOMMUNICATION AND INFORMATION (TEIN 2011), VOL 1, 2011, : 176 - 179