Automated Threat Report Classification Over Multi-Source Data

Cited by: 17
Authors
Ayoade, Gbadebo [1]
Chandra, Swarup [1]
Khan, Latifur [1]
Hamlen, Kevin [1]
Thuraisingham, Bhavani [1]
Affiliations
[1] Univ Texas Dallas, Dept Comp Sci, Richardson, TX 75083 USA
Keywords
Threat Report; Security; NLP
DOI
10.1109/CIC.2018.00040
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
With the rise of targeted attacks such as advanced persistent threats (APTs), enterprise system defenders need comprehensive frameworks that let them collaborate and evaluate their defenses against such attacks. MITRE has developed a framework that includes a database of kill chains, tactics, techniques, and procedures that attackers employ to carry out these attacks. In this work, we leverage natural language processing techniques to extract attacker actions from threat reports produced by different organizations and automatically classify them into standardized tactics and techniques, while providing relevant mitigation advisories for each attack. A naive way to achieve this is to train a machine learning model that predicts labels associating the reports with relevant categories. In practice, however, sufficient labeled data for model training is not always readily available, so training and test data often come from different sources, which introduces bias. A naive model typically underperforms in such a situation. We address this challenge by incorporating an importance weighting scheme, called bias correction, that makes efficient use of the available labeled data given the threat reports whose categories are to be predicted. We empirically evaluated our approach on 18,257 real-world threat reports generated between 2000 and 2018 by various computer security organizations, and demonstrate its superiority by comparing its performance with an existing approach.
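The abstract does not spell out how the importance weights are computed, so the following is only a minimal sketch of one common way to do importance weighting under a train/test source mismatch, not the authors' method. It assumes a density-ratio estimate obtained from a small logistic-regression "domain classifier" that separates training reports from test reports; the function names (`fit_domain_classifier`, `importance_weights`) and the one-dimensional toy features are hypothetical and stand in for real report features.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(w, b, x):
    # Linear score of feature vector x under weights w and bias b.
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def fit_domain_classifier(train_x, test_x, lr=0.1, epochs=500):
    """Fit logistic regression separating test (label 1) from train (label 0).

    Assumption: plain batch gradient descent is enough for this toy example.
    """
    dim = len(train_x[0])
    w, b = [0.0] * dim, 0.0
    data = [(x, 0.0) for x in train_x] + [(x, 1.0) for x in test_x]
    n = len(data)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in data:
            err = sigmoid(score(w, b, x)) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def importance_weights(train_x, test_x):
    """Estimate p_test(x) / p_train(x) for every training instance."""
    w, b = fit_domain_classifier(train_x, test_x)
    size_ratio = len(train_x) / len(test_x)
    weights = []
    for x in train_x:
        p_test = sigmoid(score(w, b, x))
        # Clip to avoid division by zero for extreme predictions.
        p_test = min(max(p_test, 1e-6), 1.0 - 1e-6)
        weights.append(size_ratio * p_test / (1.0 - p_test))
    return weights

# Toy data (hypothetical): most labeled reports sit near 0, test reports near 2.
train_x = [[0.0], [0.2], [0.4], [2.0]]
test_x = [[1.8], [2.0], [2.2], [2.4]]
weights = importance_weights(train_x, test_x)
```

In this sketch, training instances that resemble the test reports receive larger weights. The weights would then be passed as per-instance weights to whatever classifier predicts tactics and techniques, so that the training loss emphasizes labeled reports most similar to those being classified.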
Pages: 236-245
Page count: 10