CROSS-LINGUAL CYBERSECURITY ANALYTICS IN THE INTERNATIONAL DARK WEB WITH ADVERSARIAL DEEP REPRESENTATION LEARNING

被引:21
|
作者
Ebrahimi, Mohammadreza [1 ]
Chai, Yidong [2 ]
Samtani, Sagar [3 ]
Chen, Hsinchun [4 ]
机构
[1] Univ S Florida, Sch Informat Syst & Management, Tampa, FL 33620 USA
[2] Hefei Univ Technol, Sch Management, Anhua 230009, Peoples R China
[3] Indiana Univ, Dept Operat & Decis Technol, Bloomington, IN 47405 USA
[4] Univ Arizona, Dept Management Informat Syst, Tucson, AZ 85721 USA
基金
美国国家科学基金会;
关键词
Cybersecurity analytics; dark web; automated hacker asset detection; cross-lingual knowledge transfer; adversarial learning; computational design science;
D O I
10.25300/MISQ/2022/16618
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
International dark web platforms operating within multiple geopolitical regions and languages host a myriad of hacker assets such as malware, hacking tools, hacking tutorials, and malicious source code. Cybersecurity analytics organizations employ machine learning models trained on human-labeled data to automatically detect these assets and bolster their situational awareness. However, the lack of human-labeled training data is prohibitive when analyzing foreign-language dark web content. In this research note, we adopt the computational design science paradigm to develop a novel IT artifact for cross-lingual hacker asset detection (CLHAD). CLHAD automatically leverages the knowledge learned from English content to detect hacker assets in non-English dark web platforms. CLHAD encompasses a novel Adversarial deep representation learning (ADREL) method, which generates multilingual text representations using generative adversarial networks (GANs). Drawing upon the state of the art in cross-lingual knowledge transfer, ADREL is a novel approach to automatically extract transferable text representations and facilitate the analysis of multilingual content. We evaluate CLHAD on Russian, French, and Italian dark web platforms and demonstrate its practical utility in hacker asset profiling, and conduct a proof-of-concept case study. Our analysis suggests that cybersecurity managers may benefit more from focusing on Russian to identify sophisticated hacking assets. In contrast, financial hacker assets are scattered among several dominant dark web languages. Managerial insights for security managers are discussed at operational and strategic levels.
引用
收藏
页码:1209 / 1226
页数:18
相关论文
共 50 条
  • [1] Lightweight Cross-Lingual Sentence Representation Learning
    Mao, Zhuoyuan
    Gupta, Prakhar
    Chu, Chenhui
    Jaggi, Martin
    Kurohashi, Sadao
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1, 2021, : 2902 - 2913
  • [2] Multi-Adversarial Learning for Cross-Lingual Word Embeddings
    Wang, Haozhou
    Henderson, James
    Merlo, Paola
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 463 - 472
  • [3] Transductive Representation Learning for Cross-Lingual Text Classification
    Guo, Yuhong
    Xiao, Min
    12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2012), 2012, : 888 - 893
  • [4] Unsupervised Cross-lingual Representation Learning for Speech Recognition
    Conneau, Alexis
    Baevski, Alexei
    Collobert, Ronan
    Mohamed, Abdelrahman
    Auli, Michael
    INTERSPEECH 2021, 2021, : 2426 - 2430
  • [5] Unsupervised cross-lingual word embeddings learning with adversarial training
    Li, Yuling
    Zhang, Yuhong
    Li, Peipei
    Hu, Xuegang
    2019 10TH IEEE INTERNATIONAL CONFERENCE ON BIG KNOWLEDGE (ICBK 2019), 2019, : 150 - 156
  • [6] Cross-lingual Speaker Verification with Deep Feature Learning
    Li, Lantian
    Wang, Dong
    Rozi, Askar
    Zheng, Thomas Fang
    2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017), 2017, : 1040 - 1044
  • [7] Learning a Cross-Lingual Semantic Representation of Relations Expressed in Text
    Rettinger, Achim
    Schumilin, Artem
    Thoma, Steffen
    Ell, Basil
    SEMANTIC WEB: LATEST ADVANCES AND NEW DOMAINS, ESWC 2015, 2015, 9088 : 337 - 352
  • [8] Adversarial training with Wasserstein distance for learning cross-lingual word embeddings
    Li, Yuling
    Zhang, Yuhong
    Yu, Kui
    Hu, Xuegang
    APPLIED INTELLIGENCE, 2021, 51 (11) : 7666 - 7678
  • [9] Adversarial training with Wasserstein distance for learning cross-lingual word embeddings
    Yuling Li
    Yuhong Zhang
    Kui Yu
    Xuegang Hu
    Applied Intelligence, 2021, 51 : 7666 - 7678
  • [10] Cross-Lingual Sentiment Classification with Bilingual Document Representation Learning
    Zhou, Xinjie
    Wan, Xianjun
    Xiao, Jianguo
    PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 1403 - 1412