Automatic document classification based on latent semantic analysis

Cited by: 0
Authors
I. Kuralenok
I. Nekrest'yanov
Affiliation
[1] St. Petersburg State University,
Keywords
Latent Semantic Analysis; Latent Semantic Indexing; Hypothesis Space; Topic Description; Semantic Proximity;
DOI
Not available
Abstract
This paper considers the problem of automatically classifying documents into a given set of topics. The proposed method uses latent semantic analysis to extract semantic dependencies between words, and documents are classified on the basis of these dependencies. Experiments on the standard TREC (Text REtrieval Conference) test collection confirm the attractiveness of this approach. The method's relatively low computational complexity at the classification stage makes it applicable to the classification of document streams.
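The general idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details are not given in this record; it is a generic latent semantic analysis pipeline under stated assumptions: raw term counts, a truncated SVD of the term-document matrix, topics represented by short textual descriptions, and cosine similarity in the latent space as the measure of semantic proximity. The toy corpus, the topic descriptions, and all function names are hypothetical.

```python
import numpy as np

def build_vocab(texts):
    """Map each distinct lowercase token to a row index."""
    vocab = {}
    for text in texts:
        for tok in text.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab

def term_vector(text, vocab):
    """Raw term-count vector; tokens outside the vocabulary are ignored."""
    vec = np.zeros(len(vocab))
    for tok in text.lower().split():
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec

def fit_lsa(train_texts, k):
    """Truncated SVD of the term-document matrix.

    Returns the vocabulary and U_k (terms x k), whose columns span
    the k-dimensional latent space.
    """
    vocab = build_vocab(train_texts)
    A = np.column_stack([term_vector(t, vocab) for t in train_texts])
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return vocab, U[:, :k]

def project(text, vocab, U_k):
    """Coordinates of a text in the latent space."""
    return U_k.T @ term_vector(text, vocab)

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def classify(doc, topics, vocab, U_k):
    """Assign the topic whose description is closest in the latent space."""
    doc_vec = project(doc, vocab, U_k)
    scores = {name: cosine(doc_vec, project(desc, vocab, U_k))
              for name, desc in topics.items()}
    return max(scores, key=scores.get), scores

# Hypothetical toy corpus (assumption, not data from the paper).
corpus = [
    "cat dog pet", "dog pet animal", "cat animal pet",
    "stock market trade", "market trade price", "stock price trade",
]
# Hypothetical topic descriptions (assumption).
topics = {"animals": "cat pet", "finance": "stock price"}

vocab, U_k = fit_lsa(corpus, k=2)
# "dog animal" shares no word with the description "cat pet"; any match
# must come from co-occurrence structure captured by the SVD.
label, scores = classify("dog animal", topics, vocab, U_k)
print(label, scores)
```

In this sketch the SVD is computed once, and classifying a new document costs one matrix-vector product plus one cosine per topic, which is consistent with the abstract's remark about low cost at the classification stage.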
Pages: 199-206
Page count: 7