A short text modeling method combining semantic and statistical information

Cited by: 57
Authors
Liu Wenyin [1]
Quan, Xiaojun [1]
Feng, Min [1]
Qiu, Bite [1]
Affiliation
[1] City Univ Hong Kong, Dept Comp Sci, Kowloon Tong, Hong Kong, Peoples R China
Keywords
Text similarity; Short text similarity; Information retrieval; Query expansion; Text mining; Question answering; SIMILARITY; EXTRACTION;
DOI
10.1016/j.ins.2010.06.021
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
A novel modeling method for a collection of short text snippets is presented in this paper to measure the similarity between pairs of snippets. The method takes account of both the semantic and statistical information within the short text snippets, and consists of three steps. Given a set of raw short text snippets, it first establishes the initial similarity between words by using a lexical database. The method then iteratively calculates both word similarity and short text similarity. Finally, a proximity matrix is constructed based on word similarity and used to convert the raw text snippets into vectors. Word similarity and text clustering experiments show that the proposed short text modeling method improves the performance of existing text-related information retrieval (IR) techniques. (C) 2010 Elsevier Inc. All rights reserved.
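The final step described in the abstract, converting raw snippets into vectors with a word proximity matrix, can be illustrated with a minimal sketch. This is not the paper's actual formulation: the vocabulary, the proximity values, and the helper names (`bag_of_words`, `smooth`, `cosine`) are all hypothetical, and the hand-written matrix merely stands in for the word similarities that the method would obtain from a lexical database and its iterative refinement.

```python
import math

# Toy vocabulary (an illustrative assumption, not from the paper).
VOCAB = ["car", "automobile", "engine", "banana"]

# Hypothetical symmetric word proximity matrix with 1.0 on the diagonal.
# The values are made up; in the described method they would be derived
# from a lexical database and then refined iteratively.
PROXIMITY = [
    [1.0, 0.9, 0.5, 0.0],
    [0.9, 1.0, 0.5, 0.0],
    [0.5, 0.5, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]


def bag_of_words(text):
    """Count how often each vocabulary word occurs in the snippet."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in VOCAB]


def smooth(vec):
    """Multiply a row vector by the proximity matrix (vec @ PROXIMITY)."""
    n = len(VOCAB)
    return [sum(vec[i] * PROXIMITY[i][j] for i in range(n)) for j in range(n)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


a = bag_of_words("car engine")
b = bag_of_words("automobile")
c = bag_of_words("banana")

# The raw vectors of a and b share no words, so their cosine is 0.0.
raw_ab = cosine(a, b)

# After smoothing, related words contribute to each other's dimensions.
smooth_ab = cosine(smooth(a), smooth(b))

# Unrelated snippets stay dissimilar under this toy matrix.
smooth_ac = cosine(smooth(a), smooth(c))

print(raw_ab, smooth_ab, smooth_ac)
```

In this toy setup, two snippets with no words in common become comparable once their vectors are multiplied by the proximity matrix, which is the kind of effect the abstract attributes to combining semantic and statistical information.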
Pages: 4031-4041
Number of pages: 11
Related papers
50 in total
  • [41] A semi-explicit short text retrieval method combining Wikipedia features
    Li, Pu
    Li, Tianci
    Zhang, Suzhi
    Li, Yuhua
    Tang, Yong
    Jiang, Yuncheng
    [J]. Engineering Applications of Artificial Intelligence, 2020, 94
  • [42] A semi-explicit short text retrieval method combining Wikipedia features
    Li, Pu
    Li, Tianci
    Zhang, Suzhi
    Li, Yuhua
    Tang, Yong
    Jiang, Yuncheng
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2020, 94
  • [43] Combining semantic and term frequency similarities for text clustering
    Victor Hugo Andrade Soares
    Ricardo J. G. B. Campello
    Seyednaser Nourashrafeddin
    Evangelos Milios
    Murilo Coelho Naldi
    [J]. Knowledge and Information Systems, 2019, 61 : 1485 - 1516
  • [44] Semantic Information Modeling and Implementation Method for Water Conservancy Equipment
    Wang, Songsong
    Xu, Ouguan
    [J]. IEEE ACCESS, 2023, 11 : 133879 - 133890
  • [45] Scientific Text Semantic Structure Modeling
    Bazhenova, Elena
    Tihomirova, Larisa
    Kyrkunova, Larisa
    Danilevskaya, Natalya
    Karpova, Tatyana
    [J]. AMAZONIA INVESTIGA, 2019, 8 (21): : 163 - 167
  • [46] Combining semantic and term frequency similarities for text clustering
    Andrade Soares, Victor Hugo
    Campello, Ricardo J. G. B.
    Nourashrafeddin, Seyednaser
    Milios, Evangelos
    Naldi, Murilo Coelho
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2019, 61 (03) : 1485 - 1516
  • [47] Multi-label Text Classification Method Based on Label Semantic Information
    Xiao, Lin
    Chen, Bo-Li
    Huang, Xin
    Liu, Hua-Feng
    Jing, Li-Ping
    Yu, Jian
    [J]. Ruan Jian Xue Bao/Journal of Software, 2020, 31 (04): : 1079 - 1089
  • [48] A Noval Information Management Method of Text Recognition Based on Ontology Semantic Domain
    Yu Yangxin
    Wang Liuyang
    [J]. STRUCTURAL ENGINEERING, VIBRATION AND AEROSPACE ENGINEERING, 2014, 482 : 335 - 340
  • [49] Short Text Understanding Combining Text Conceptualization and Transformer Embedding
    Li, Jun
    Huang, Guimin
    Chen, Jianheng
    Wang, Yabing
    [J]. IEEE ACCESS, 2019, 7 : 122183 - 122191
  • [50] Semantic modeling and visualization of semantic groups of clinical text documents
    Kenei J.
    Opiyo E.
    [J]. International Journal of Information Technology, 2022, 14 (5) : 2585 - 2593