Unstructured Big Data Threat Intelligence Parallel Mining Algorithm

Authors
Li, Zhihua [1]
Yu, Xinye [1]
Wei, Tao [1]
Qian, Junhao [2]
Affiliations
[1] Jiangnan Univ, Sch Artificial Intelligence & Comp Sci, Wuxi 214122, Peoples R China
[2] Jiangnan Univ, Sch IoT Engn, Wuxi 214122, Peoples R China
Source
BIG DATA MINING AND ANALYTICS | 2024 / Vol. 7 / Issue 02
Keywords
unstructured big data mining; parallel deep forest; multi-label classification algorithm; threat intelligence;
DOI
10.26599/BDMA.2023.9020032
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
To efficiently mine threat intelligence from the vast array of open-source cybersecurity analysis reports on the web, we have developed the Parallel Deep Forest-based Multi-Label Classification (PDFMLC) algorithm. Initially, open-source cybersecurity analysis reports are collected and converted into a standardized text format. Subsequently, five tactics category labels are annotated, creating a multi-label dataset for tactics classification. Addressing the limitations of low execution efficiency and scalability in the sequential deep forest algorithm, our PDFMLC algorithm employs broadcast variables and the Lempel-Ziv-Welch (LZW) algorithm, significantly enhancing its acceleration ratio. Furthermore, our proposed PDFMLC algorithm incorporates label mutual information from the established dataset as input features. This captures latent label associations, significantly improving classification accuracy. Finally, we present the PDFMLC-based Threat Intelligence Mining (PDFMLC-TIM) method. Experimental results demonstrate that the PDFMLC algorithm exhibits exceptional node scalability and execution efficiency. Simultaneously, the PDFMLC-TIM method proficiently conducts text classification on cybersecurity analysis reports, extracting tactics entities to construct comprehensive threat intelligence. As a result, successfully formatted STIX2.1 threat intelligence is established.
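The abstract states that label mutual information is used as an input feature but does not say how it is computed. As a minimal sketch, assuming binary label indicators and the standard empirical definition of mutual information, the pairwise values for a multi-label dataset could be computed as follows. The function name `label_mutual_information` and the toy label matrix are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def label_mutual_information(Y):
    """Pairwise empirical mutual information (in nats) between label columns.

    Y is a list of rows; each row is a list of 0/1 label indicators.
    Hypothetical helper for illustration, not the authors' implementation.
    """
    n = len(Y)
    k = len(Y[0])
    mi = [[0.0] * k for _ in range(k)]
    for a in range(k):
        for b in range(a, k):
            joint = Counter((row[a], row[b]) for row in Y)
            count_a = Counter(row[a] for row in Y)
            count_b = Counter(row[b] for row in Y)
            total = 0.0
            for (u, v), c in joint.items():
                p_uv = c / n
                # p(u, v) / (p(u) * p(v)) simplifies to c * n / (count_a[u] * count_b[v])
                total += p_uv * math.log(c * n / (count_a[u] * count_b[v]))
            mi[a][b] = mi[b][a] = total
    return mi


# Toy data (assumed): labels 0 and 1 always co-occur; label 2 is independent of label 0.
Y = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 1],
]
mi = label_mutual_information(Y)
```

In this toy matrix the first two labels are perfectly dependent, so their mutual information equals the entropy of a fair binary variable, while the first and third labels are independent and score zero. How such values are attached to the feature vector in PDFMLC is not described in the abstract.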
Pages: 531-546
Page count: 16