Automatic document classification based on latent semantic analysis

Cited by: 2
Authors
Kuralenok, I [1]
Nekrest'yanov, I [1]
Affiliations
[1] St Petersburg State Univ, St Petersburg 199164, Russia
Keywords
Latent Semantic Analysis; Latent Semantic Indexing; Hypothesis Space; Topic Description; Semantic Proximity;
DOI
10.1007/BF02759469
Chinese Library Classification
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
This paper considers the problem of automatically classifying documents into a given set of topics. The proposed method uses latent semantic analysis to extract semantic dependencies between words, and documents are classified on the basis of these dependencies. Experiments on the standard TREC (Text REtrieval Conference) test collection confirm the attractiveness of this approach. Because the method has relatively low computational complexity at the classification stage, it can be applied to the classification of document streams.
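The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea (learn a latent term space from a corpus, then assign a document to the topic whose description is closest in that space), not the authors' method. The toy corpus, the topic descriptions, and all function names are hypothetical. Raw term counts are used without weighting, and power iteration with Gram-Schmidt deflation stands in for a full singular value decomposition so that the example needs only the standard library.

```python
import math
import random


def term_document_matrix(docs, vocab):
    """Rows are terms, columns are documents; entries are raw term counts."""
    index = {term: i for i, term in enumerate(vocab)}
    matrix = [[0.0] * len(docs) for _ in vocab]
    for j, doc in enumerate(docs):
        for word in doc.lower().split():
            if word in index:  # words outside the vocabulary are ignored
                matrix[index[word]][j] += 1.0
    return matrix


def leading_term_vectors(matrix, k, iters=200, seed=0):
    """Approximate the k leading left singular vectors of `matrix`
    by power iteration on A * A^T with Gram-Schmidt deflation.
    Assumes k does not exceed the rank of the matrix."""
    n = len(matrix)
    gram = [[sum(a * b for a, b in zip(matrix[i], matrix[j]))
             for j in range(n)] for i in range(n)]
    rng = random.Random(seed)  # fixed seed keeps the sketch deterministic
    basis = []
    for _ in range(k):
        vec = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            vec = [sum(g * x for g, x in zip(row, vec)) for row in gram]
            for prev in basis:  # remove components already captured
                dot = sum(p * x for p, x in zip(prev, vec))
                vec = [x - dot * p for x, p in zip(vec, prev)]
            norm = math.sqrt(sum(x * x for x in vec))
            vec = [x / norm for x in vec]
        basis.append(vec)
    return basis


def project(text, vocab, basis):
    """Map a text to its coordinates in the latent space."""
    column = [row[0] for row in term_document_matrix([text], vocab)]
    return [sum(u * c for u, c in zip(vec, column)) for vec in basis]


def cosine(a, b):
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def classify(text, topics, vocab, basis):
    """Return the best-matching topic name and all similarity scores."""
    doc_vec = project(text, vocab, basis)
    scores = {name: cosine(doc_vec, project(description, vocab, basis))
              for name, description in topics.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy corpus: three documents about pets, three about finance.
corpus = [
    "cat dog pet animal",
    "dog pet vet animal",
    "cat pet food animal",
    "stock market bank money",
    "bank money loan market",
    "stock money trade market",
]
vocab = sorted({word for doc in corpus for word in doc.split()})
basis = leading_term_vectors(term_document_matrix(corpus, vocab), k=2)

# Hypothetical topic descriptions, given as short word lists.
topics = {"animals": "pet animal", "finance": "money market"}

label, scores = classify("vet food", topics, vocab, basis)
print(label)
```

In this sketch the document "vet food" shares no word with either topic description, yet it should still be matched to the "animals" topic, because "vet" and "food" co-occur with "pet" and "animal" in the corpus. Once the latent basis is built, classifying a new document costs one projection and one cosine per topic, which is in line with the abstract's remark about low cost at the classification stage.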
Pages: 199-206
Page count: 8