Mining criminal networks from unstructured text documents

Cited by: 40
Authors
Al-Zaidy, Rabeah [1]
Fung, Benjamin C. M. [1]
Youssef, Amr M. [1]
Fortin, Francis [2]
Affiliations
[1] Concordia Univ, Concordia Inst Informat Syst Engn, Montreal, PQ H3G 1M8, Canada
[2] Surete Quebec, Montreal, PQ, Canada
Keywords
Forensic analysis; Data mining; Hypothesis generation; Criminal network; Information retrieval
DOI
10.1016/j.diin.2011.12.001
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Digital data collected for forensic analysis often contain valuable information about the suspects' social networks. However, most collected records are in the form of unstructured textual data, such as e-mails, chat messages, and text documents. An investigator often has to manually extract the useful information from the text and then enter the important pieces into a structured database for further investigation using various criminal network analysis tools. This manual information extraction process is tedious and error-prone. Moreover, the quality of the analysis varies with the experience and expertise of the investigator. In this paper, we propose a systematic method to discover criminal networks from a collection of text documents obtained from a suspect's machine, extract useful information for investigation, and then visualize the suspect's criminal network. Furthermore, we present a hypothesis generation approach to identify potential indirect relationships among the members of the identified networks. We evaluated the effectiveness and performance of the method on a real-life cybercrime case and some other datasets. The proposed method, together with the implemented software tool, has received positive feedback from the digital forensics team of a law enforcement unit in Canada. © 2012 Elsevier Ltd. All rights reserved.
Pages: 147-160
Number of pages: 14
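
The abstract describes two steps: linking people who appear together in a suspect's documents, and proposing indirect relationships between people who are not directly linked. The following is a minimal illustrative sketch of that general idea, not the authors' algorithm. It assumes a known list of names, uses plain case-insensitive substring matching as a stand-in for real entity extraction, and treats "sharing a neighbor" as the only evidence of an indirect relationship. The function names `build_cooccurrence_network` and `indirect_candidates` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_network(documents, names):
    """Link two names whenever both appear in the same document.

    Assumption: case-insensitive substring matching stands in for
    proper named-entity extraction.
    Returns a dict mapping (name_a, name_b) to the set of document indices.
    """
    edges = defaultdict(set)
    for doc_id, text in enumerate(documents):
        lowered = text.lower()
        present = sorted(n for n in names if n.lower() in lowered)
        for a, b in combinations(present, 2):
            edges[(a, b)].add(doc_id)
    return dict(edges)


def indirect_candidates(edges):
    """Return pairs with no direct link but at least one shared neighbor.

    Assumption: a shared neighbor is the only signal used to hypothesize
    an indirect relationship.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    candidates = {}
    for a, b in combinations(sorted(neighbors), 2):
        if b in neighbors[a]:
            continue  # already directly linked
        shared = neighbors[a] & neighbors[b]
        if shared:
            candidates[(a, b)] = sorted(shared)
    return candidates


# Made-up example documents and names, for illustration only.
documents = [
    "Alice emailed Bob about the shipment.",
    "Bob met Carol on Friday.",
    "Dave wrote a note.",
]
names = ["Alice", "Bob", "Carol", "Dave"]

edges = build_cooccurrence_network(documents, names)
hypotheses = indirect_candidates(edges)
```

Here Alice and Carol never appear in the same document, but both co-occur with Bob, so the pair is surfaced as a candidate indirect relationship for an investigator to review.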