A Local Latent Semantic Analysis-based Kernel for Document Similarities

Author
Aseervatham, Sujeevan [1]
Affiliation
[1] Univ Paris 13, CNRS, LIPN, UMR 7030, F-93430 Villetaneuse, France
DOI
10.1109/IJCNN.2008.4633792
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The document similarity measure is a key component of textual data processing and largely determines the performance of a processing system. For the past decade, kernels have been used as similarity functions within inner-product-based algorithms such as the SVM for NLP problems, especially text categorization. In this paper, we present a semantic space constructed from latent concepts, which are extracted using Latent Semantic Analysis (LSA). To take into account the specificity of each document category, we use local LSA to define the global semantic space. Furthermore, we propose a weighted semantic kernel for the global space. Experimental results on text categorization tasks show that this kernel performs better than global LSA kernels, especially for small LSA dimensions.
Pages: 214-219
Number of pages: 6
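
The abstract describes the approach only at a high level, so the following is a minimal sketch under stated assumptions rather than the paper's actual method. It assumes that "local LSA" means one truncated SVD per document category, that the global semantic space is the concatenation of the per-category latent concepts, and that the kernel weights are singular values normalized by the largest one in each category. The function names `local_lsa_basis` and `weighted_semantic_kernel`, the weighting scheme, and the cosine normalization are all hypothetical choices made for illustration.

```python
import numpy as np


def local_lsa_basis(X, labels, k):
    """Build a global concept basis from per-category (local) LSA.

    X      : (n_terms, n_docs) term-document matrix.
    labels : length-n_docs array of category labels.
    k      : number of latent concepts kept per category.
    Returns (P, w): P stacks the local concept vectors column-wise,
    w holds one weight per concept (assumed: normalized singular values).
    """
    labels = np.asarray(labels)
    blocks, weights = [], []
    for c in np.unique(labels):
        Xc = X[:, labels == c]                      # documents of category c
        U, s, _ = np.linalg.svd(Xc, full_matrices=False)
        r = min(k, len(s))
        blocks.append(U[:, :r])                     # local latent concepts
        weights.append(s[:r] / max(s[0], 1e-12))    # assumed weighting
    return np.hstack(blocks), np.concatenate(weights)


def weighted_semantic_kernel(A, B, P, w):
    """Cosine-normalized kernel K(a, b) = phi(a)^T diag(w) phi(b),
    where phi(d) = P^T d projects a document into the global space."""
    FA = np.sqrt(w)[:, None] * (P.T @ A)            # (n_concepts, n_docs_A)
    FB = np.sqrt(w)[:, None] * (P.T @ B)
    K = FA.T @ FB
    na = np.linalg.norm(FA, axis=0)
    nb = np.linalg.norm(FB, axis=0)
    return K / np.maximum(np.outer(na, nb), 1e-12)  # guard against zero norms


# Toy usage: 12 terms, 8 documents, 2 categories, 2 concepts per category.
rng = np.random.default_rng(0)
X_demo = rng.random((12, 8))
y_demo = np.array([0, 0, 0, 0, 1, 1, 1, 1])
P_demo, w_demo = local_lsa_basis(X_demo, y_demo, k=2)
K_demo = weighted_semantic_kernel(X_demo, X_demo, P_demo, w_demo)
```

Because the kernel is an ordinary inner product of reweighted projections, the resulting Gram matrix is symmetric and positive semidefinite, so it could be passed to an SVM that accepts a precomputed kernel.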