A short text modeling method combining semantic and statistical information

Cited by: 59

Authors:

Liu Wenyin [1]

Quan, Xiaojun [1]

Feng, Min [1]

Qiu, Bite [1]

Affiliations:

[1] City Univ Hong Kong, Dept Comp Sci, Kowloon Tong, Hong Kong, Peoples R China

Source:

INFORMATION SCIENCES | 2010 / Vol. 180 / Issue 20

Keywords:

Text similarity; Short text similarity; Information retrieval; Query expansion; Text mining; Question answering; SIMILARITY; EXTRACTION;

DOI:

10.1016/j.ins.2010.06.021

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

A novel modeling method for a collection of short text snippets is presented in this paper to measure the similarity between pairs of snippets. The method takes account of both the semantic and statistical information within the short text snippets, and consists of three steps. Given a set of raw short text snippets, it first establishes the initial similarity between words by using a lexical database. The method then iteratively calculates both word similarity and short text similarity. Finally, a proximity matrix is constructed based on word similarity and used to convert the raw text snippets into vectors. Word similarity and text clustering experiments show that the proposed short text modeling method improves the performance of existing text-related information retrieval (IR) techniques. (C) 2010 Elsevier Inc. All rights reserved.
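The abstract names three steps but does not give the update formulas, so the following Python sketch only illustrates the general shape of such a pipeline. Everything specific in it is an assumption made for illustration: the `seeds` dictionary stands in for a lexical database, the mixing weight `alpha`, the cosine-style normalization, and the function name `model_short_texts` are hypothetical, and the update rule is a simple mutual-reinforcement scheme rather than the authors' actual equations.

```python
from math import sqrt


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def normalize(m):
    """Scale a square matrix as m[i][j] / sqrt(m[i][i] * m[j][j]), capped at 1."""
    d = [sqrt(m[i][i]) if m[i][i] > 0 else 1.0 for i in range(len(m))]
    return [[min(1.0, m[i][j] / (d[i] * d[j])) for j in range(len(m))]
            for i in range(len(m))]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def model_short_texts(snippets, seeds, alpha=0.5, iterations=5):
    """Illustrative three-step pipeline (assumed formulas, not the paper's)."""
    vocab = sorted({w for s in snippets for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    n = len(vocab)

    # Raw term-frequency matrix: one row per snippet, one column per word.
    x = [[0.0] * n for _ in snippets]
    for r, snippet in enumerate(snippets):
        for w in snippet:
            x[r][index[w]] += 1.0
    xt = transpose(x)

    # Step 1: initial word similarity (identity plus seed pairs that stand in
    # for scores from a lexical database).
    w0 = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for (a, b), score in seeds.items():
        if a in index and b in index:
            w0[index[a]][index[b]] = w0[index[b]][index[a]] = score

    # Step 2: alternate between text similarity and word similarity.
    word_sim = w0
    for _ in range(iterations):
        text_sim = normalize(matmul(matmul(x, word_sim), xt))
        statistical = normalize(matmul(matmul(xt, text_sim), x))
        word_sim = [[alpha * w0[i][j] + (1 - alpha) * statistical[i][j]
                     for j in range(n)] for i in range(n)]

    # Step 3: use the word proximity matrix to turn raw snippets into vectors.
    vectors = matmul(x, word_sim)
    return vocab, word_sim, vectors


snippets = [["car", "engine"], ["automobile", "motor"], ["banana", "fruit"]]
seeds = {("car", "automobile"): 0.9, ("engine", "motor"): 0.8}
vocab, word_sim, vectors = model_short_texts(snippets, seeds)
print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```

In this toy run the first two snippets share no words, yet their vectors overlap because the seed similarities propagate through the proximity matrix, while the third snippet stays orthogonal to both. That is the kind of effect the abstract attributes to combining semantic and statistical information.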


Pages: 4031-4041

Page count: 11

50 records in total

[31] Text Infilling Method based on Key Semantic Information Selection Mechanism
Zheng, Shuting
Tian, Wenjing
Cai, Xiaodong
2020 5TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE, COMPUTER TECHNOLOGY AND TRANSPORTATION (ISCTT 2020), 2020, : 219 - 223
[32] SEMANTIC INFORMATION AND STATISTICAL INFERENCE
MENGES, G
BIOMETRISCHE ZEITSCHRIFT, 1972, 14 (06): : 409 - 418
[33] Benchmarking short text semantic similarity
O'Shea J.
Bandar Z.
Crockett K.
McLean D.
International Journal of Intelligent Information and Database Systems, 2010, 4 (02) : 103 - 120
[34] Semantic Enriched Short Text Clustering
Kozlowski, Marek
Rybinski, Henryk
FOUNDATIONS OF INTELLIGENT SYSTEMS, ISMIS 2017, 2017, 10352 : 435 - 445
[35] Text Clustering Using Statistical and Semantic Data
Benghabrit, Asmaa
Ouhbi, Brahim
Behja, Hicham
Frikh, Bouchra
WORLD CONGRESS ON COMPUTER & INFORMATION TECHNOLOGY (WCCIT 2013), 2013,
[36] A Short Text Classification Method Based on Convolutional Neural Network and Semantic Extension
Wang, Haitao
Tian, Keke
Wu, Zhengjiang
Wang, Lei
INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2021, 14 (01) : 367 - 375
[37] A Semantic-based Method of Internet Public Opinion Analysis for Short Text
Hou, Shengluan
Liu, Lei
Cao, Cungen
Yan, Shuying
INTERNATIONAL SYMPOSIUM ON FUZZY SYSTEMS, KNOWLEDGE DISCOVERY AND NATURAL COMPUTATION (FSKDNC 2014), 2014, : 335 - 339
[38] Similarity Calculation Method of Chinese Short Text Based on Semantic Feature Space
Pan, Liqiang
Zhang, Pu
Xiong, Anping
INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2015, 6 (02) : 306 - 310
[39] LDA-PSTR: A Topic Modeling Method for Short Text
Zhou, Kai
Yang, Qun
ADVANCED DATA MINING AND APPLICATIONS, ADMA 2018, 2018, 11323 : 339 - 352
[40] A semi-explicit short text retrieval method combining Wikipedia features
Li, Pu
Li, Tianci
Zhang, Suzhi
Li, Yuhua
Tang, Yong
Jiang, Yuncheng
Engineering Applications of Artificial Intelligence, 2020, 94
