Fast Extraction of Semantic Features from a Latent Semantic Indexed Text Corpus

Authors
A. Kabán
M. A. Girolami
Affiliation
[1] Helsinki University of Technology, Laboratory of Computer and Information Science
Source
Neural Processing Letters | 2002 / Vol. 15
Keywords
latent semantic indexing; probabilistic latent semantic analysis; projection pursuit; semantic feature extraction; text analysis
DOI
Not available
Abstract
This paper proposes a projection-based symmetrical factorisation method for extracting semantic features from collections of text documents stored in a Latent Semantic space. Preliminary experimental results demonstrate that this yields a representation comparable to that provided by a novel probabilistic approach, which reconsiders the entire indexing problem of text documents and works directly in the original high-dimensional vector-space representation of text. The projection index employed is derived here from the a priori constraints on the problem. The principal advantage of this approach is computational efficiency, obtained by exploiting Latent Semantic Indexing as a preprocessing stage. Simulation results on subsets of the 20-Newsgroups text corpus in various settings are provided.
Pages: 31-43
Number of pages: 12