Language independent semantic kernels for short-text classification

Cited by: 40
Authors
Kim, Kwanho [1]
Chung, Beom-suk [2]
Choi, Yerim [2]
Lee, Seungjun [2]
Jung, Jae-Yoon [1]
Park, Jonghun [2]
Affiliations
[1] Kyung Hee Univ, Dept Ind & Management Syst Engn, Yongin 446701, Gyeonggi, South Korea
[2] Seoul Natl Univ, Dept Ind Engn, Seoul 151744, South Korea
Funding
National Research Foundation of Singapore
Keywords
Short-text document classification; Kernel method; Similarity measure; Language independent semantic kernel; SVM;
DOI
10.1016/j.eswa.2013.07.097
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Short-text classification is increasingly used in a wide range of applications. It remains a challenging problem, however, because word occurrences in short-text documents are sparse, even though some recently developed methods that exploit syntactic or semantic information have improved classification performance. The major drawback of these methods is language dependency: they rely heavily on grammatical tags and lexical databases, which hinders their use in applications that span diverse languages. In this article, we propose a novel kernel, called the language independent semantic (LIS) kernel, which effectively computes the similarity between short-text documents without using grammatical tags or lexical databases. Experimental results on English and Korean datasets show that the LIS kernel outperforms several existing kernels. © 2013 Elsevier Ltd. All rights reserved.
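The abstract does not define the LIS kernel itself, so the following is only a minimal sketch of the general idea it describes: a document kernel that captures semantic relatedness from corpus statistics alone, with no grammatical tags or lexical databases. The construction used here (term similarity from document co-occurrence) is a generic stand-in and an assumption, not the authors' method; the function names and the toy corpus are hypothetical.

```python
from collections import Counter
from math import sqrt


def tokenize(text):
    # Whitespace splitting only: no POS tagger, stemmer, or lexical database.
    return text.lower().split()


def term_similarity(docs):
    """Return sim(a, b): cosine similarity of the sets of documents containing a and b."""
    occ = {}
    for idx, doc in enumerate(docs):
        for term in set(tokenize(doc)):
            occ.setdefault(term, set()).add(idx)

    def sim(a, b):
        da, db = occ.get(a), occ.get(b)
        if not da or not db:
            # Terms unseen in the corpus are similar only to themselves.
            return 1.0 if a == b else 0.0
        return len(da & db) / sqrt(len(da) * len(db))

    return sim


def semantic_kernel(d1, d2, sim):
    # Sum term-pair similarities weighted by term frequency in each document.
    t1, t2 = Counter(tokenize(d1)), Counter(tokenize(d2))
    return sum(c1 * c2 * sim(a, b) for a, c1 in t1.items() for b, c2 in t2.items())


def normalized_kernel(d1, d2, sim):
    # Cosine-style normalization so that a document compared with itself scores 1.
    k11 = semantic_kernel(d1, d1, sim)
    k22 = semantic_kernel(d2, d2, sim)
    if not k11 or not k22:
        return 0.0
    return semantic_kernel(d1, d2, sim) / sqrt(k11 * k22)


# Hypothetical toy corpus used only to estimate term similarities.
corpus = [
    "cheap flight tickets",
    "cheap airfare tickets",
    "football match score",
    "football league score",
]
sim = term_similarity(corpus)

# These two documents share no words, yet the kernel value is positive
# because their terms co-occur with common terms in the corpus.
related = normalized_kernel("cheap flight", "airfare tickets", sim)
unrelated = normalized_kernel("cheap flight", "football score", sim)
print(related, unrelated)
```

A Gram matrix filled with such kernel values could then be passed to an SVM that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.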
Pages: 735-743
Number of pages: 9
Related papers
50 in total
  • [1] Review of short-text classification
    Alsmadi, Issa
    Gan, Keng Hoon
    [J]. INTERNATIONAL JOURNAL OF WEB INFORMATION SYSTEMS, 2019, 15 (02) : 155 - 182
  • [2] Transductive learning for short-text classification problems using latent semantic indexing
    Zelikovitz, S
    Marquez, F
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2005, 19 (02) : 143 - 163
  • [3] Few-shot short-text classification with language representations and centroid similarity
    Liu, Wenfu
    Pang, Jianmin
    Li, Nan
    Yue, Feng
    Liu, Guangming
    [J]. APPLIED INTELLIGENCE, 2023, 53 (07) : 8061 - 8072
  • [4] Few-shot short-text classification with language representations and centroid similarity
    Wenfu Liu
    Jianmin Pang
    Nan Li
    Feng Yue
    Guangming Liu
    [J]. Applied Intelligence, 2023, 53 : 8061 - 8072
  • [5] Short-text classification based on ICA and LSA
    Pu, Qiang
    Yang, Guo-Wei
    [J]. ADVANCES IN NEURAL NETWORKS - ISNN 2006, PT 2, PROCEEDINGS, 2006, 3972 : 265 - 270
  • [6] Intent Classification of Short-Text on Social Media
    Purohit, Hemant
    Dong, Guozhu
    Shalin, Valerie
    Thirunarayan, Krishnaprasad
    Sheth, Amit
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON SMART CITY/SOCIALCOM/SUSTAINCOM (SMARTCITY), 2015, : 222 - 228
  • [7] A Comparative Analysis of Strategies for Semantic Short-Text Categorization
    Rosas, Maria V.
    Errecalde, Marcelo L.
    Rosso, Paolo
    [J]. PROCESAMIENTO DEL LENGUAJE NATURAL, 2010, (44) : 11 - 18
  • [8] ClassiNet - Predicting Missing Features for Short-Text Classification
    Bollegala, Dan Ushka
    Atanasov, Vincent
    Maehara, Takanori
    Kawarabayashi, Ken-Ichi
    [J]. ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2018, 12 (05)
  • [9] A Short-Text Similarity Model Combining Semantic and Syntactic Information
    Zhou, Ya
    Li, Cheng
    Huang, Guimin
    Guo, Qingkai
    Li, Hui
    Wei, Xiong
    [J]. ELECTRONICS, 2023, 12 (14)
  • [10] Knowledge Guided Short-Text Classification For Healthcare Applications
    Cao, Shilei
    Qian, Buyue
    Yin, Changchang
    Li, Xiaoyu
    Wei, Jishang
    Zheng, Qinghua
    Davidson, Ian
    [J]. 2017 17TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2017, : 31 - 40