Mining criminal networks from unstructured text documents

Cited by: 40
Authors
Al-Zaidy, Rabeah [1]
Fung, Benjamin C. M. [1]
Youssef, Amr M. [1]
Fortin, Francis [2]
Affiliations
[1] Concordia Univ, Concordia Inst Informat Syst Engn, Montreal, PQ H3G 1M8, Canada
[2] Surete Quebec, Montreal, PQ, Canada
Keywords
Forensic analysis; Data mining; Hypothesis generation; Criminal network; Information retrieval
DOI
10.1016/j.diin.2011.12.001
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Digital data collected for forensics analysis often contain valuable information about the suspects' social networks. However, most collected records are in the form of unstructured textual data, such as e-mails, chat messages, and text documents. An investigator often has to manually extract the useful information from the text and then enter the important pieces into a structured database for further investigation by using various criminal network analysis tools. Obviously, this information extraction process is tedious and error-prone. Moreover, the quality of the analysis varies by the experience and expertise of the investigator. In this paper, we propose a systematic method to discover criminal networks from a collection of text documents obtained from a suspect's machine, extract useful information for investigation, and then visualize the suspect's criminal network. Furthermore, we present a hypothesis generation approach to identify potential indirect relationships among the members in the identified networks. We evaluated the effectiveness and performance of the method on a real-life cybercrime case and some other datasets. The proposed method, together with the implemented software tool, has received positive feedback from the digital forensics team of a law enforcement unit in Canada. (C) 2012 Elsevier Ltd. All rights reserved.
Pages: 147-160
Page count: 14