Towards Safe Cyber Practices: Developing a Proactive Cyber-Threat Intelligence System for Dark Web Forum Content by Identifying Cybercrimes

Cited by: 2
Authors
Sangher, Kanti Singh [1]
Singh, Archana [2]
Pandey, Hari Mohan [3]
Kumar, Vivek [4]
Affiliations
[1] Ctr Dev Adv Comp, Sch IT, Noida 201307, India
[2] Amity Univ, Amity Sch Engn & Technol, Noida 201313, India
[3] Bournemouth Univ, Dept Comp & Informat, Fern Barrow BH12 5BB, Poole, England
[4] Univ Cagliari, Dept Math & Comp Sci, I-09124 Cagliari, Italy
Keywords
dark web forum; cyber security; cybercrimes; deep learning; natural language processing; Agora marketplace; BERT; law enforcement agencies; SILK ROAD; MARKET; IDENTIFICATION; INTERNET; IMPACT; CURVE; DRUGS
DOI
10.3390/info14060349
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The Dark Web, the untraceable part of the Deep Web, is one of the most widely used "secretive spaces" in which terrorists, cybercriminals, spies, and other offenders carry out illegal and criminal activities. Identifying actions, products, and offenders on the Dark Web is challenging because of its size, intractability, and anonymity. It is therefore crucial to apply tools and techniques that can identify Dark Web activities and serve as a support system for law enforcement agencies. This study proposes classification models based on four deep learning architectures (RNN, CNN, LSTM, and Transformer) that use pre-trained word embedding representations to identify illicit activities related to cybercrimes on Dark Web forums. We used the Agora dataset derived from the DarkNet market archive, which lists 109 activities by category. The listings in the dataset are vaguely described, and several data points are untagged, which rules out automatically labeling category items as target classes. To overcome this constraint, we applied a carefully designed human annotation scheme that takes all attributes into account to infer the context. We conducted comprehensive evaluations to assess the performance of the proposed approach. Our BERT-based classification model achieved an accuracy of 96%. Given the class imbalance in the experimental data, the results indicate the advantage of our tailored data preprocessing strategies and validate our annotation scheme. In real-world scenarios, law enforcement agencies can use this work to analyze Dark Web forums and identify cybercrimes, and it can pave the way for more sophisticated systems tailored to their requirements.
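The abstract reports accuracy on imbalanced data. The sketch below is a minimal illustration, not the authors' evaluation code, of why a per-class metric such as macro-F1 is often read alongside accuracy in that setting. The category names, the label counts, and the helper functions `accuracy` and `macro_f1` are hypothetical and are not taken from the paper or the Agora dataset.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical, deliberately imbalanced labels (assumption, not paper data):
# nine listings of a majority class and one of a minority class.
y_true = ["drugs"] * 9 + ["hacking"]
# A degenerate classifier that always predicts the majority class.
y_pred = ["drugs"] * 10

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={mf1:.3f}")
```

In this toy case the always-majority classifier scores high accuracy while macro-F1 stays low, because the minority class is never recovered.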
Pages: 20