A Local Latent Semantic Analysis-based Kernel for Document Similarities

Author
Aseervatham, Sujeevan [1]
Affiliation
[1] Univ Paris 13, CNRS, LIPN, UMR 7030, F-93430 Villetaneuse, France
DOI
10.1109/IJCNN.2008.4633792
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The document similarity measure is a key component of textual data processing and is largely responsible for the performance of a processing system. For the past decade, kernels have been used as similarity functions within inner-product-based algorithms such as the SVM for NLP problems, especially text categorization. In this paper, we present a semantic space constructed from latent concepts. The concepts are extracted using Latent Semantic Analysis (LSA). To take into account the specificity of each document category, we use local LSA to define the global semantic space. Furthermore, we propose a weighted semantic kernel for the global space. Experimental results on text categorization tasks show that this kernel performs better than global LSA kernels, especially for small LSA dimensions.
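The idea in the abstract can be illustrated with a minimal NumPy sketch. This record gives only the abstract, so the details below are assumptions rather than the paper's method: each category's local LSA is taken as a truncated SVD of that category's term-by-document matrix, the global space is formed by stacking the local concept vectors side by side, and the concept weights are normalized singular values. The function names (`local_lsa_basis`, `build_global_projection`, `weighted_semantic_kernel`) and the toy data are hypothetical.

```python
import numpy as np

def local_lsa_basis(X, k):
    """Return the top-k left singular vectors and singular values of X.

    X is a term-by-document matrix for a single category.
    """
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k], s[:k]

def build_global_projection(X, labels, k):
    """Stack the local LSA concepts of every category into one matrix.

    Assumption: the global space is the concatenation of local concept
    vectors; the paper may combine them differently.
    """
    bases, weights = [], []
    for c in np.unique(labels):
        U, s = local_lsa_basis(X[:, labels == c], k)
        bases.append(U)
        # Hypothetical weighting: singular values normalized per category.
        weights.append(s / s.sum())
    P = np.hstack(bases)           # shape: terms x (k * number of categories)
    w = np.concatenate(weights)    # one non-negative weight per concept
    return P, w

def weighted_semantic_kernel(d1, d2, P, w):
    """Weighted inner product of two documents in the stacked concept space."""
    z1 = P.T @ d1
    z2 = P.T @ d2
    return float(np.sum(w * z1 * z2))

# Toy data: 6 terms, 8 documents, 2 categories (illustrative only).
rng = np.random.default_rng(0)
X = rng.random((6, 8))
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

P, w = build_global_projection(X, labels, k=2)
print(weighted_semantic_kernel(X[:, 0], X[:, 5], P, w))
```

Because the weights are non-negative, this function equals d1ᵀ P diag(w) Pᵀ d2 and is therefore a valid (positive semi-definite) kernel, so its Gram matrix could in principle be passed to an SVM that accepts precomputed kernels.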
Pages: 214-219
Page count: 6