A Short Text Similarity Algorithm for Finding Similar Police 110 Incidents

Cited by: 0
Authors
Duan, Lei [1]
Xu, Tongge [2]
Affiliations
[1] Beihang Univ, Sch Comp Sci & Engn, Beijing 100191, Peoples R China
[2] Beihang Univ, Sch Comp Sci & Engn, Beijing 100191, Peoples R China
Source
2016 7TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA (CCBD) | 2016
Keywords
short text similarity; word embedding; police intelligence; CRIME; FRAMEWORK; COPLINK;
DOI
10.1109/CCBD.2016.22
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Finding similar police 110 incidents in an incident dataset plays an important role in recognising related cases, from which investigators can find more clues and make better decisions on police deployment. We aim to find 110 incidents whose case features and semantics are similar to those of a given incident. A short text similarity algorithm is presented. Our algorithm is developed from a novel semantic similarity algorithm, Word Mover's Distance (WMD). To emphasize the significance of case features in the incident text, the method introduces traditional term frequency-inverse document frequency (TF-IDF) values as term weights in WMD. The algorithm is then verified on a practical dataset from a public security department, and experiments show that it is effective and improves the accuracy of finding similar police incidents.
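The idea of using TF-IDF values as term weights in WMD can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy 2-D embeddings, the tokenised corpus, the smoothed IDF formula, and all function names are assumptions made for illustration. The transport step also uses the relaxed (nearest-neighbour) lower bound of WMD instead of solving the full optimal-transport problem, so it only approximates the distance the abstract describes.

```python
import math
from collections import Counter


def tfidf_weights(doc, corpus):
    """Return TF-IDF weights for the distinct tokens of `doc`, normalised to sum to 1."""
    tf = Counter(doc)
    n_docs = len(corpus)
    raw = {}
    for term, count in tf.items():
        df = sum(1 for other in corpus if term in other)
        # Smoothed IDF (an assumed variant) so that every weight stays positive.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        raw[term] = (count / len(doc)) * idf
    total = sum(raw.values())
    return {term: w / total for term, w in raw.items()}


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def relaxed_tfidf_wmd(doc_a, doc_b, corpus, emb):
    """Relaxed WMD with TF-IDF word masses (a lower bound on the exact WMD).

    Each word sends all of its mass to the nearest word of the other document;
    the larger of the two one-sided costs is returned.
    """
    w_a = tfidf_weights(doc_a, corpus)
    w_b = tfidf_weights(doc_b, corpus)

    def one_sided(src, dst):
        return sum(
            weight * min(euclidean(emb[s], emb[t]) for t in dst)
            for s, weight in src.items()
        )

    return max(one_sided(w_a, w_b), one_sided(w_b, w_a))


# Hypothetical 2-D embeddings; real systems would load trained word vectors.
EMB = {
    "wallet": (1.0, 0.0), "purse": (0.9, 0.1),
    "stolen": (0.0, 1.0), "theft": (0.1, 0.9),
    "bus": (0.5, 0.5),
    "car": (-1.0, 0.0), "crash": (0.0, -1.0),
}

# Hypothetical tokenised incident texts.
CORPUS = [
    ["wallet", "stolen", "bus"],
    ["purse", "theft", "bus"],
    ["car", "crash"],
]

query = CORPUS[0]
# Rank the other incidents by distance to the query (smaller means more similar).
ranked = sorted(CORPUS[1:], key=lambda d: relaxed_tfidf_wmd(query, d, CORPUS, EMB))
print(ranked[0])
```

In this sketch, terms that are rare across the corpus carry more mass, so a mismatch on a distinctive case feature costs more than a mismatch on a common word. Replacing `one_sided` with an exact transportation solver would give the full weighted WMD.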
Pages: 260 - 264
Page count: 5
Related papers
50 in total
  • [1] An algorithm for semantic similarity of short text based on WordNet
    Zhai, Yan-Dong
    Wang, Kang-Ping
    Zhang, Dong-Na
    Huang, Lan
    Zhou, Chun-Guang
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2012, 40 (03): : 617 - 620
  • [2] Measuring Source Code Similarity by Finding Similar Subgraph with an Incremental Genetic Algorithm
    Kim, Jinhyun
    Choi, HyukGeun
    Yun, Hansang
    Moon, Byung-Ro
    GECCO'16: PROCEEDINGS OF THE 2016 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, 2016, : 925 - 932
  • [3] A Chinese short text similarity algorithm based on semantic and syntax
    Liao, Zhi-Fang (zfliao@csu.edu.cn), 1600, Hunan University (43):
  • [4] Learning Text Representations for Finding Similar Exercises
    Feng, Mengfei
    Chen, Yishuai
    Guo, Yuchun
    Zhao, Yongxiang
    Fu, Guowei
    2019 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS - TAIWAN (ICCE-TW), 2019,
  • [5] Finding Similar Files Using Text Mining
    Asanka, P. P. G. Dinesh
    PROCEEDINGS OF THE 2013 8TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION (ICCSE 2013), 2013, : 431 - 435
  • [6] An effective short text conceptualization based on new short text similarity
    Bekkali, Mohammed
    Lachkar, Abdelmonaime
    SOCIAL NETWORK ANALYSIS AND MINING, 2018, 9 (01)
  • [7] Similarity measures for short segments of text
    Metzler, Donald
    Dumais, Susan
    Meek, Christopher
    ADVANCES IN INFORMATION RETRIEVAL, 2007, 4425 : 16 - +
  • [8] Benchmarking short text semantic similarity
    O'Shea J.
    Bandar Z.
    Crockett K.
    McLean D.
    International Journal of Intelligent Information and Database Systems, 2010, 4 (02) : 103 - 120
  • [9] An efficient algorithm for finding similar short substrings from large scale string data
    Uno, Takeaki
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2008, 5012 : 345 - 356
  • [10] A Novel Semi-supervised Short Text Classification Algorithm Based on Fusion Similarity
    Li, Xiaohong
    Yan, Li
    Qin, Na
    Ran, Hongyan
    INTELLIGENT COMPUTING METHODOLOGIES, ICIC 2017, PT III, 2017, 10363 : 309 - 319