Short Text Similarity Calculation Using Semantic Information

Cited by: 11
Authors
Pu, Haoyu [1 ]
Fei, Gaolei [1 ]
Zhao, Hailin [1 ]
Hu, Guangmin [1 ]
Jiao, Chengbo [2 ]
Xu, Zhoujun [2 ]
Affiliations
[1] Univ Elect Sci & Technol China, Chengdu, Sichuan, Peoples R China
[2] Beijing Informat Technol Inst, Beijing, Peoples R China
Keywords
short text; semantic similarity; knowledge-based; corpus-based;
DOI
10.1109/BIGCOM.2017.53
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Text similarity is an important tool in text data analysis and is often used in text clustering and classification. Social media is a popular class of online applications that contains a great deal of valuable information. Short texts are common in social media, and short text similarity is often used in social media data mining. Because short texts contain few words, they offer few features, and the accuracy of similarity calculation is low; a common improvement is to compute short text similarity using word semantic similarity. This paper proposes a short text semantic similarity calculation method that combines a knowledge-based method and a corpus-based method. The method builds on an improved word semantic similarity calculation method and a general short text semantic similarity calculation method. The word similarity calculation combines two word semantic similarity measures through several strategies. It draws on the advantages of both methods to overcome the disadvantages of each one alone, finds more semantic associations among the words in the texts, and improves the accuracy of word similarity calculation. The paper compares and analyzes several word and text semantic similarity algorithms on a large amount of corpus data; the improved method yields results closer to human ratings than the other methods for both word and text similarity.
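The abstract does not state which combination strategies or which text-level formula the paper uses, so the following Python sketch only illustrates the general idea under stated assumptions. The two lookup tables stand in for a knowledge-based and a corpus-based word similarity measure and contain made-up scores. The combination rule (take the larger score) and the text-level rule (symmetric best-match average) are assumptions chosen for illustration, not the authors' method. All function and variable names are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List

WordSim = Callable[[str, str], float]


def make_table_sim(table: Dict[FrozenSet[str], float]) -> WordSim:
    """Build a word-similarity function from a toy lookup table (made-up data)."""
    def sim(a: str, b: str) -> float:
        if a == b:
            return 1.0
        # Unordered pair lookup; unknown pairs score 0.0.
        return table.get(frozenset((a, b)), 0.0)
    return sim


def combined_word_sim(a: str, b: str,
                      knowledge_sim: WordSim, corpus_sim: WordSim) -> float:
    """Assumed strategy: keep the larger of the two scores, so a pair that
    one measure misses can still be recovered by the other."""
    return max(knowledge_sim(a, b), corpus_sim(a, b))


def text_sim(s1: List[str], s2: List[str], word_sim: WordSim) -> float:
    """Assumed text-level rule: average each word's best match in the other
    text, then average the two directions."""
    if not s1 or not s2:
        return 0.0

    def directed(src: List[str], dst: List[str]) -> float:
        return sum(max(word_sim(w, v) for v in dst) for w in src) / len(src)

    return 0.5 * (directed(s1, s2) + directed(s2, s1))


# Toy stand-ins for the two measures; the scores are invented for illustration.
knowledge = make_table_sim({frozenset(("car", "automobile")): 0.9})
corpus = make_table_sim({frozenset(("fast", "quick")): 0.8})


def combined(a: str, b: str) -> float:
    return combined_word_sim(a, b, knowledge, corpus)


t1 = ["fast", "car"]
t2 = ["quick", "automobile"]
print("knowledge only:", text_sim(t1, t2, knowledge))
print("corpus only:   ", text_sim(t1, t2, corpus))
print("combined:      ", text_sim(t1, t2, combined))
```

In this toy case each single measure recognizes only one of the two word pairs, while the combined measure recognizes both, which is the kind of additional semantic association the abstract describes.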
Pages: 144-150
Number of pages: 7
Related papers
50 in total
  • [1] Short Text Similarity Calculation Based on Jaccard and Semantic Mixture
    Wu, Shushu
    Liu, Fang
    Zhang, Kai
    [J]. Communications in Computer and Information Science, 2021, 1363 CCIS : 37 - 45
  • [2] Measuring the short text similarity based on semantic and syntactic information
    Yang, Jiaqi
    Li, Yongjun
    Gao, Congjie
    Zhang, Yinyin
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2021, 114 : 169 - 180
  • [3] Combining Statistical Information and Semantic Similarity for Short Text Feature Extension
    Li, Xiaohong
    Su, Yun
    Ma, Huifang
    Cao, Lin
    [J]. INTELLIGENT INFORMATION PROCESSING VIII, 2016, 486 : 205 - 210
  • [4] A Short-Text Similarity Model Combining Semantic and Syntactic Information
    Zhou, Ya
    Li, Cheng
    Huang, Guimin
    Guo, Qingkai
    Li, Hui
    Wei, Xiong
    [J]. ELECTRONICS, 2023, 12 (14)
  • [5] Benchmarking short text semantic similarity
    O'Shea, James
    Bandar, Zuhair
    Crockett, Keeley
    McLean, David
    [J]. International Journal of Intelligent Information and Database Systems, 2010, 4 (02) : 103 - 120
  • [6] A Short Text Similarity Calculation Method Combining Semantic and Headword Attention Mechanism
    Ji, Mingyu
    Zhang, Xinhai
    [J]. SCIENTIFIC PROGRAMMING, 2022, 2022
  • [7] Similarity Calculation Method of Chinese Short Text Based on Semantic Feature Space
    Pan, Liqiang
    Zhang, Pu
    Xiong, Anping
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2015, 6 (02) : 306 - 310
  • [8] MEASURING SHORT TEXT SEMANTIC SIMILARITY USING MULTIPLE MEASUREMENTS
    Zhu, Tian-Tian
    Lan, Man
    [J]. PROCEEDINGS OF 2013 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOLS 1-4, 2013, : 808 - 813
  • [9] EVALUATION AND CLASSIFICATION OF SYNTAX INFORMATION USAGE IN DETERMINING SHORT TEXT SEMANTIC SIMILARITY
    Batanovic, Vuk
    Bojic, Dragan
    [J]. 2013 21ST TELECOMMUNICATIONS FORUM (TELFOR), 2013, : 821 - 824
  • [10] Study on Text Semantic Similarity in Information Retrieval
    Feng, Shaorong
    Xiao, Wenjun
    [J]. 2008 INTERNATIONAL CONFERENCE ON INFORMATION AND AUTOMATION, VOLS 1-4, 2008, : 713 - 717