A Short Text Similarity Algorithm for Finding Similar Police 110 Incidents

Cited by: 0

Authors:

Duan, Lei [1]

Xu, Tongge [2]

Affiliations:

[1] Beihang Univ, Sch Comp Sci & Engn, Beijing 100191, Peoples R China

[2] Beihang Univ, Sch Comp Sci & Engn, Beijing 100191, Peoples R China

Source:

2016 7TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA (CCBD) | 2016

Keywords:

short text similarity; word embedding; police intelligence; CRIME; FRAMEWORK; COPLINK;

DOI:

10.1109/CCBD.2016.22

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Finding similar police 110 incidents in an incident dataset plays an important role in recognising related cases, from which investigators can find more clues and make better decisions on police deployment. We aim to find 110 incidents whose case features and semantics are similar to those of a given incident. A short text similarity algorithm is presented. It is developed from Word Mover's Distance (WMD), a novel semantic similarity algorithm. To emphasize the significance of case features in incident text, the method introduces traditional term frequency-inverse document frequency (TF-IDF) values as term weights in WMD. The algorithm is then verified on a practical dataset from a public security department, and experiments show that it is effective and can improve the accuracy of finding similar police incidents.
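
The abstract does not give the exact weighting formula, so the following is only a rough sketch of the general idea: replace the word-frequency mass that WMD normally moves with normalized TF-IDF mass. To stay dependency-free it computes the relaxed lower bound on WMD (each word ships all of its mass to its nearest counterpart) rather than solving the full transport problem. The function names, the smoothed IDF variant, the toy corpus, and the 2-D "embeddings" are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def tfidf_weights(tokens, corpus):
    """Return normalized TF-IDF mass per distinct token of one document.

    Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1, so every
    weight is strictly positive.
    """
    n_docs = len(corpus)
    raw = {}
    for word, count in Counter(tokens).items():
        df = sum(1 for doc in corpus if word in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        raw[word] = count * idf
    total = sum(raw.values())
    return {word: value / total for word, value in raw.items()}


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def one_sided_cost(src, dst, vectors):
    """Cost when each source word sends all its mass to its nearest target word."""
    return sum(
        mass * min(euclidean(vectors[word], vectors[other]) for other in dst)
        for word, mass in src.items()
    )


def relaxed_weighted_wmd(doc_a, doc_b, corpus, vectors):
    """Relaxed (lower-bound) WMD with TF-IDF mass; both documents must be non-empty."""
    weights_a = tfidf_weights(doc_a, corpus)
    weights_b = tfidf_weights(doc_b, corpus)
    return max(
        one_sided_cost(weights_a, weights_b, vectors),
        one_sided_cost(weights_b, weights_a, vectors),
    )


# Hypothetical tokenized incident texts and toy 2-D word vectors.
corpus = [
    ["stolen", "wallet", "street"],
    ["theft", "purse", "road"],
    ["noise", "complaint", "night"],
]
vectors = {
    "stolen": (1.0, 0.0), "theft": (1.0, 0.1),
    "wallet": (0.0, 1.0), "purse": (0.1, 1.0),
    "street": (0.5, 0.5), "road": (0.5, 0.6),
    "noise": (-1.0, 0.0), "complaint": (-1.0, -0.5), "night": (0.0, -1.0),
}

near = relaxed_weighted_wmd(corpus[0], corpus[1], corpus, vectors)
far = relaxed_weighted_wmd(corpus[0], corpus[2], corpus, vectors)
print(near < far)
```

In this toy setup the theft-related pair comes out closer than the theft/noise pair because their word vectors lie near each other. A real system would use trained word embeddings and, if the exact distance is needed, an optimal transport solver in place of the relaxation.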


Pages: 260-264

Page count: 5

50 items in total

[31] A Short Text Similarity Measure Based on Hidden Topics
Chen, Hong-chao
Guo, Xiao-hua
Liu, Ling-qiang
Zhu, Xin-hua
COMPUTER SCIENCE AND TECHNOLOGY (CST2016), 2017, : 1101 - 1108
[32] Short Text Similarity Calculation Using Semantic Information
Pu, Haoyu
Fei, Gaolei
Zhao, Hailin
Hu, Guangmin
Jiao, Chengbo
Xu, Zhoujun
2017 3RD INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING AND COMMUNICATIONS (BIGCOM), 2017, : 144 - 150
[33] Improving Short Text Clustering by Similarity Matrix Sparsification
Rakib, Md Rashadul Hasan
Jankowska, Magdalena
Zeh, Norbert
Milios, Evangelos
PROCEEDINGS OF THE ACM SYMPOSIUM ON DOCUMENT ENGINEERING (DOCENG 2018), 2018,
[34] Mining Summary of Short Text with Centroid Similarity Distance
Franciscus, Nigel
Wang, Junhu
Stantic, Bela
ADVANCED DATA MINING AND APPLICATIONS, ADMA 2019, 2019, 11888 : 447 - 461
[35] An Improved News Recommendation Algorithm Based on Text Similarity
Gao, Yihang
Zhao, Hui
Zhou, Qian
Qiu, Meikang
Liu, Meiqin
2020 3RD INTERNATIONAL CONFERENCE ON SMART BLOCKCHAIN (SMARTBLOCK), 2020, : 132 - 136
[36] Mapping Texts Into Graphs: An Improved Text Similarity Algorithm
Liu, Zuoguo
Chen, Xiaorong
PROCEEDINGS OF 2012 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2012), 2012, : 1357 - 1361
[37] An Improved Text Similarity Calculation Algorithm Based On VSM
Li, Lian
Zhu, AiHong
Su, Tao
ADVANCED RESEARCH ON AUTOMATION, COMMUNICATION, ARCHITECTONICS AND MATERIALS, PTS 1 AND 2, 2011, 225-226 (1-2): : 1105 - 1108
[38] A fuzzy clustering approach for finding similar documents using a novel similarity measure
Saracoglu, Ridvan
Tutuncu, Kemal
Allahverdi, Novruz
EXPERT SYSTEMS WITH APPLICATIONS, 2007, 33 (03) : 600 - 605
[39] Research on Keyword-Overlap Similarity Algorithm Optimization in Short English Text Based on Lexical Chunk Theory
Li, Na
Li, Cheng
Zhang, Honglie
JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2023, 19 (05): : 631 - 640
[40] Detecting short passages of similar text in large document collections
Lyon, C
Malcolm, J
Dickerson, B
PROCEEDINGS OF THE 2001 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, 2001, : 118 - 125
