Automatic document classification based on latent semantic analysis

Cited by: 2
Authors
Kuralenok, I [1]
Nekrest'yanov, I [1]
Affiliations
[1] St Petersburg State Univ, St Petersburg 199164, Russia
Keywords
Latent Semantic Analysis; Latent Semantic Indexing; Hypothesis Space; Topic Description; Semantic Proximity
DOI
10.1007/BF02759469
Chinese Library Classification number
TP31 [Computer software]
Subject classification code
081202; 0835
Abstract
This paper considers the problem of automatically classifying documents into a given set of topics. The proposed method uses latent semantic analysis to extract semantic dependencies between words, and documents are then classified on the basis of these dependencies. Experiments on the standard TREC (Text REtrieval Conference) test data set confirm the attractiveness of this approach. Because the method has relatively low computational complexity at the classification stage, it can be applied to the classification of document streams.
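The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a plain term-frequency matrix, a truncated SVD as the latent semantic analysis step, and cosine similarity between a document and a short topic description in the latent space. All function names, the toy corpus, and the topic descriptions are hypothetical.

```python
import numpy as np


def term_vector(text, index):
    """Bag-of-words count vector over a fixed vocabulary index."""
    vec = np.zeros(len(index))
    for token in text.lower().split():
        if token in index:
            vec[index[token]] += 1.0
    return vec


def fit_lsa(train_docs, k):
    """Build a term-document matrix and keep its k strongest left singular vectors."""
    vocab = sorted({tok for doc in train_docs for tok in doc.lower().split()})
    index = {term: i for i, term in enumerate(vocab)}
    # Rows are terms, columns are documents.
    A = np.column_stack([term_vector(doc, index) for doc in train_docs])
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return index, U[:, :k]


def project(text, index, U_k):
    """Map a text into the k-dimensional latent space."""
    return U_k.T @ term_vector(text, index)


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def classify(text, topics, index, U_k):
    """Return the topic whose description is closest to the text in latent space."""
    doc = project(text, index, U_k)
    scores = {name: cosine(doc, project(desc, index, U_k))
              for name, desc in topics.items()}
    return max(scores, key=scores.get)


# Hypothetical toy corpus and topic descriptions (not from the paper).
TRAIN_DOCS = [
    "cat dog pet animal",
    "dog pet food",
    "cat animal vet",
    "stock market trade",
    "market price stock bank",
    "bank trade price",
]
TOPICS = {"pets": "pet animal", "finance": "stock bank"}

index, U_k = fit_lsa(TRAIN_DOCS, k=2)
print(classify("vet food", TOPICS, index, U_k))
print(classify("trade price", TOPICS, index, U_k))
```

In this sketch the text "vet food" shares no words with the "pets" description, so a purely lexical match would score zero; the latent space links them through co-occurrence in the training documents. Classifying a new document costs one projection and one cosine per topic, which is in keeping with the abstract's remark about low cost at the classification stage.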
Pages: 199-206
Page count: 8
Related papers
  • [1] Automatic document classification based on latent semantic analysis
    I. Kuralenok
    I. Nekrest'yanov
    [J]. Programming and Computer Software, 2000, 26 : 199 - 206
  • [2] Document Classification Method based on Latent Semantic Indexing
    Kim, Jeong-Joon
    Lee, Yong-Soo
    Moon, Jin-Yong
    Park, Jeong-Min
    [J]. INTERNATIONAL JOURNAL OF GRID AND DISTRIBUTED COMPUTING, 2018, 11 (04): : 97 - 112
  • [3] Latent Semantic Analysis Boosted Convolutional Neural Networks for Document Classification
    Gultepe, Eren
    Kamkarhaghighi, Mehran
    Makrehchi, Masoud
    [J]. 2018 5TH INTERNATIONAL CONFERENCE ON BEHAVIORAL, ECONOMIC, AND SOCIO-CULTURAL COMPUTING (BESC), 2018, : 93 - 98
  • [4] Semantic Document Classification Based on Semantic Similarity Computation and Correlation Analysis
    Yang, Shuo
    Wei, Ran
    Guo, Jingzhi
    [J]. ADVANCES IN E-BUSINESS ENGINEERING FOR UBIQUITOUS COMPUTING, 2020, 41 : 3 - 18
  • [5] Sentiment Classification of Documents Based on Latent Semantic Analysis
    Wang, Lan
    Wan, Yuan
    [J]. ADVANCED RESEARCH ON COMPUTER EDUCATION, SIMULATION AND MODELING, PT II, 2011, 176 (02): : 356 - +
  • [6] A protein classification method based on Latent Semantic Analysis
    Yuan, Yongsheng
    Lin, Lei
    Dong, Qiwen
    Wang, Xiaolong
    Li, Minghui
    [J]. 2005 27TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-7, 2005, : 7738 - 7741
  • [7] A Local Latent Semantic Analysis-based Kernel for Document Similarities
    Aseervatham, Sujeevan
    [J]. 2008 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-8, 2008, : 214 - 219
  • [8] Latent Semantic Analysis Approach for Document Summarization Based on Word Embeddings
    Al-Sabahi, Kamal
    Zhang Zuping
    Kang, Yang
    [J]. KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2019, 13 (01): : 254 - 276
  • [9] Web Text Classification Based on Improved Latent Semantic Analysis
    Wang, Lan
    Wan, Yuan
    [J]. 2011 SECOND ETP/IITA CONFERENCE ON TELECOMMUNICATION AND INFORMATION (TEIN 2011), VOL 1, 2011, : 176 - 179
  • [10] Two-stage Automatic Image Annotation Based on Latent Semantic Scene Classification
    Ge, Hongwei
    Zhang, Kai
    Hou, Yaqing
    Yu, Chao
    Zhao, Mingde
    Wang, Zhen
    Sun, Liang
    [J]. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,