A short text modeling method combining semantic and statistical information

Cited by: 57
Authors
Liu Wenyin [1]
Quan, Xiaojun [1]
Feng, Min [1]
Qiu, Bite [1]
Affiliations
[1] City Univ Hong Kong, Dept Comp Sci, Kowloon Tong, Hong Kong, Peoples R China
Keywords
Text similarity; Short text similarity; Information retrieval; Query expansion; Text mining; Question answering; SIMILARITY; EXTRACTION
DOI
10.1016/j.ins.2010.06.021
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A novel modeling method for a collection of short text snippets is presented in this paper to measure the similarity between pairs of snippets. The method takes account of both the semantic and statistical information within the short text snippets, and consists of three steps. Given a set of raw short text snippets, it first establishes the initial similarity between words by using a lexical database. The method then iteratively calculates both word similarity and short text similarity. Finally, a proximity matrix is constructed based on word similarity and used to convert the raw text snippets into vectors. Word similarity and text clustering experiments show that the proposed short text modeling method improves the performance of existing text-related information retrieval (IR) techniques. © 2010 Elsevier Inc. All rights reserved.
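The final step described in the abstract (using a word-to-word proximity matrix to convert raw snippets into vectors) can be illustrated with a minimal sketch. This is not the paper's implementation: the vocabulary, the similarity values, and the helper names (`to_counts`, `apply_proximity`, `snippet_similarity`) are all hypothetical, and the lexical-database initialization and the iterative refinement steps are not shown. The sketch only assumes that a proximity matrix already exists and applies it to raw count vectors before taking a cosine similarity.

```python
import math

# Hypothetical toy vocabulary and word-to-word similarity values.
# These numbers are made up for illustration; they do not come from the paper.
VOCAB = ["car", "automobile", "banana"]
PROXIMITY = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]


def to_counts(snippet):
    """Return the raw bag-of-words count vector of a snippet over VOCAB."""
    tokens = snippet.lower().split()
    return [float(tokens.count(word)) for word in VOCAB]


def apply_proximity(vec):
    """Spread each word's weight onto similar words: v' = v * PROXIMITY."""
    n = len(VOCAB)
    return [sum(vec[i] * PROXIMITY[i][j] for i in range(n)) for j in range(n)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def snippet_similarity(s1, s2):
    """Cosine similarity of two snippets after the proximity transform."""
    return cosine(apply_proximity(to_counts(s1)), apply_proximity(to_counts(s2)))
```

With these assumed values, the snippets "car" and "automobile" share no terms, so their raw count vectors have zero cosine similarity, while the transformed vectors are close because the proximity matrix moves weight between the two related words.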
Pages: 4031-4041
Page count: 11
Related papers
50 in total
  • [1] Combining Statistical Information and Semantic Similarity for Short Text Feature Extension
    Li, Xiaohong
    Su, Yun
    Ma, Huifang
    Cao, Lin
    [J]. INTELLIGENT INFORMATION PROCESSING VIII, 2016, 486 : 205 - 210
  • [2] A Short-Text Similarity Model Combining Semantic and Syntactic Information
    Zhou, Ya
    Li, Cheng
    Huang, Guimin
    Guo, Qingkai
    Li, Hui
    Wei, Xiong
    [J]. ELECTRONICS, 2023, 12 (14)
  • [3] Microblog Short Text Semantic Modeling Method for Search
    Kou, Fei-Fei
    Du, Jun-Ping
    Shi, Yan-Song
    Yang, Cong-Xian
    Cui, Wan-Qiu
    Liang, Mei-Yu
    Shi, Lei
    [J]. Jisuanji Xuebao/Chinese Journal of Computers, 2020, 43 (05): 781 - 795
  • [4] A Short Text Similarity Calculation Method Combining Semantic and Headword Attention Mechanism
    Ji, Mingyu
    Zhang, Xinhai
    [J]. SCIENTIFIC PROGRAMMING, 2022, 2022
  • [5] Combining information in statistical modeling
    Pena, D
    [J]. AMERICAN STATISTICIAN, 1997, 51 (04): 326 - 332
  • [6] A Robust Method: Arbitrary Shape Text Detection Combining Semantic and Position Information
    Wang, Zhenchao
    Silamu, Wushour
    Li, Yuze
    Xu, Miaomiao
    [J]. SENSORS, 2022, 22 (24)
  • [7] AN APPROACH FOR COMBINING SEMANTIC INFORMATION AND PROXIMITY INFORMATION FOR TEXT SUMMARIZATION
    Jeong, Hogyeong
    Yun, Yeogirl
    [J]. KDIR 2011: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND INFORMATION RETRIEVAL, 2011: 427 - 432
  • [8] Combining Lexical and Semantic Features for Short Text Classification
    Yang, Lili
    Li, Chunping
    Ding, Qiang
    Li, Li
    [J]. 17TH INTERNATIONAL CONFERENCE IN KNOWLEDGE BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS - KES2013, 2013, 22 : 78 - 86
  • [9] A text similarity measurement combining word semantic information with TF-IDF method
    Huang, Cheng-Hui
    Yin, Jian
    Hou, Fang
    [J]. Jisuanji Xuebao/Chinese Journal of Computers, 2011, 34 (05): 856 - 864
  • [10] Short Text Similarity Calculation Using Semantic Information
    Pu, Haoyu
    Fei, Gaolei
    Zhao, Hailin
    Hu, Guangmin
    Jiao, Chengbo
    Xu, Zhoujun
    [J]. 2017 3RD INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING AND COMMUNICATIONS (BIGCOM), 2017: 144 - 150