A Short Text Similarity Algorithm for Finding Similar Police 110 Incidents

Cited by: 0
Authors
Duan, Lei [1]
Xu, Tongge [2]
Affiliations
[1] Beihang Univ, Sch Comp Sci & Engn, Beijing 100191, Peoples R China
[2] Beihang Univ, Sch Comp Sci & Engn, Beijing 100191, Peoples R China
Source
2016 7TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA (CCBD) | 2016
Keywords
short text similarity; word embedding; police intelligence; CRIME; FRAMEWORK; COPLINK;
DOI
10.1109/CCBD.2016.22
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Finding similar police 110 incidents in an incident dataset plays an important role in recognising related cases, from which investigators can find more clues and make better decisions on police deployment. We aim to find 110 incidents whose case features and semantics are similar to those of a given incident. A short text similarity algorithm is presented. Our algorithm builds on a novel semantic similarity algorithm, Word Mover's Distance (WMD). To emphasize the significance of case features in incident text, the method introduces traditional term frequency-inverse document frequency (TF-IDF) values as term weights in WMD. The algorithm is then verified on a practical dataset from a public security department, and experiments show that it is effective and improves accuracy in finding similar police incidents.
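The abstract does not give the exact formulation, so the following is only a minimal sketch of the general idea of using TF-IDF values as term weights in a WMD-style distance. Several things here are assumptions rather than the authors' method: the smoothed IDF formula, the toy 2-D embeddings in `emb`, the tokenized example incidents in `corpus`, and all function names. To stay dependency-free, the sketch also swaps the exact WMD transport problem for its relaxed lower bound, in which each word ships all of its mass to the nearest word in the other text.

```python
import math
from collections import Counter


def tfidf_weights(doc, corpus):
    """Return normalized TF-IDF weights (summing to 1) for the tokens in doc."""
    n_docs = len(corpus)
    weights = {}
    for term, count in Counter(doc).items():
        df = sum(1 for d in corpus if term in d)
        # Smoothed IDF (an assumption); always positive, so no term gets zero mass.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        weights[term] = (count / len(doc)) * idf
    total = sum(weights.values())
    return {term: w / total for term, w in weights.items()}


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def one_sided_cost(src, dst, emb):
    """Cost when every source word moves its whole mass to its nearest target word."""
    return sum(
        mass * min(euclidean(emb[term], emb[other]) for other in dst)
        for term, mass in src.items()
    )


def relaxed_weighted_wmd(doc_a, doc_b, corpus, emb):
    """Relaxed (lower-bound) WMD with TF-IDF term masses instead of raw frequencies."""
    wa = tfidf_weights(doc_a, corpus)
    wb = tfidf_weights(doc_b, corpus)
    # The tighter of the two one-sided relaxations.
    return max(one_sided_cost(wa, wb, emb), one_sided_cost(wb, wa, emb))


# Hypothetical 2-D word embeddings; a real system would load trained vectors.
emb = {
    "wallet": (1.0, 0.0),
    "purse": (0.9, 0.1),
    "stolen": (0.0, 1.0),
    "theft": (0.1, 0.9),
    "car": (-1.0, 0.0),
    "crash": (0.0, -1.0),
    "reported": (0.5, 0.5),
}

# Hypothetical, already tokenized incident texts.
corpus = [
    ["wallet", "stolen", "reported"],
    ["purse", "theft", "reported"],
    ["car", "crash", "reported"],
]

d_similar = relaxed_weighted_wmd(corpus[0], corpus[1], corpus, emb)
d_dissimilar = relaxed_weighted_wmd(corpus[0], corpus[2], corpus, emb)
print(d_similar, d_dissimilar)
```

In this toy setup the word "reported" occurs in every incident, so it receives a smaller share of the mass than the rarer case-feature words, which is the effect the TF-IDF weighting is meant to achieve. An exact WMD would instead solve the full transport problem between the two weighted word distributions.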
Pages: 260-264
Number of pages: 5
Related papers
50 in total
  • [31] A Short Text Similarity Measure Based on Hidden Topics
    Chen, Hong-chao
    Guo, Xiao-hua
    Liu, Ling-qiang
    Zhu, Xin-hua
    COMPUTER SCIENCE AND TECHNOLOGY (CST2016), 2017, : 1101 - 1108
  • [32] Short Text Similarity Calculation Using Semantic Information
    Pu, Haoyu
    Fei, Gaolei
    Zhao, Hailin
    Hu, Guangmin
    Jiao, Chengbo
    Xu, Zhoujun
    2017 3RD INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING AND COMMUNICATIONS (BIGCOM), 2017, : 144 - 150
  • [33] Improving Short Text Clustering by Similarity Matrix Sparsification
    Rakib, Md Rashadul Hasan
    Jankowska, Magdalena
    Zeh, Norbert
    Milios, Evangelos
    PROCEEDINGS OF THE ACM SYMPOSIUM ON DOCUMENT ENGINEERING (DOCENG 2018), 2018,
  • [34] Mining Summary of Short Text with Centroid Similarity Distance
    Franciscus, Nigel
    Wang, Junhu
    Stantic, Bela
    ADVANCED DATA MINING AND APPLICATIONS, ADMA 2019, 2019, 11888 : 447 - 461
  • [35] An Improved News Recommendation Algorithm Based on Text Similarity
    Gao, Yihang
    Zhao, Hui
    Zhou, Qian
    Qiu, Meikang
    Liu, Meiqin
    2020 3RD INTERNATIONAL CONFERENCE ON SMART BLOCKCHAIN (SMARTBLOCK), 2020, : 132 - 136
  • [36] Mapping Texts Into Graphs: An Improved Text Similarity Algorithm
    Liu, Zuoguo
    Chen, Xiaorong
    PROCEEDINGS OF 2012 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2012), 2012, : 1357 - 1361
  • [37] An Improved Text Similarity Calculation Algorithm Based On VSM
    Li, Lian
    Zhu, AiHong
    Su, Tao
    ADVANCED RESEARCH ON AUTOMATION, COMMUNICATION, ARCHITECTONICS AND MATERIALS, PTS 1 AND 2, 2011, 225-226 (1-2): : 1105 - 1108
  • [38] A fuzzy clustering approach for finding similar documents using a novel similarity measure
    Saracoglu, Ridvan
    Tutuncu, Kemal
    Allahverdi, Novruz
    EXPERT SYSTEMS WITH APPLICATIONS, 2007, 33 (03) : 600 - 605
  • [39] Research on Keyword-Overlap Similarity Algorithm Optimization in Short English Text Based on Lexical Chunk Theory
    Li, Na
    Li, Cheng
    Zhang, Honglie
    JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2023, 19 (05): : 631 - 640
  • [40] Detecting short passages of similar text in large document collections
    Lyon, C
    Malcolm, J
    Dickerson, B
    PROCEEDINGS OF THE 2001 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, 2001, : 118 - 125