Enhancing Cyber Threat Intelligence with Named Entity Recognition using BERT-CRF

Cited by: 1
Authors
Chen, Sheng-Shan [1]
Hwang, Ren-Hung [2]
Sun, Chin-Yu [1]
Lin, Ying-Dar [3]
Pai, Tun-Wen [1]
Affiliations
[1] Natl Taipei Univ Technol, Dept Comp Sci & Informat Engn, Taipei, Taiwan
[2] Natl Yang Ming Chiao Tung Univ, Coll Artificial Intelligence, Tainan, Taiwan
[3] Natl Yang Ming Chiao Tung Univ, Dept Comp, Hsinchu, Taiwan
Keywords
cyber threat intelligence; deep learning; cyber security
DOI
10.1109/GLOBECOM54140.2023.10436853
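Chinese Library Classification (CLC) codes
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809
Abstract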
Cyber Threat Intelligence (CTI) helps organizations understand the tactics, techniques, and procedures used by potential cyber criminals so that they can defend against cyber threats. To protect an organization's core systems and services, security analysts must analyze information about threats and vulnerabilities, but analyzing large amounts of data requires significant time and effort. To streamline this process, we propose an enhanced architecture, BERT-CRF, obtained by removing the BiLSTM layer from the conventional BERT-BiLSTM-CRF model. The model leverages the strengths of deep learning-based language models to extract critical threat intelligence and novel information from threats effectively. In our BERT-CRF model, the token embeddings generated by BERT are fed directly into the Conditional Random Field (CRF) layer for efficient Named Entity Recognition (NER), eliminating the need for an intermediate BiLSTM layer. We train and evaluate the model on three publicly available threat entity databases. We also collect open-source threat intelligence data from recent years to evaluate the applicability of the model in a real-world environment. Furthermore, we compare our model with the popular GPT-3.5 and with the most downloaded open-source BERT question-answering models. In this study, our proposed model demonstrated robust usability and outperformed the other models, indicating its potential for application in CTI. In a real-world scenario, our model achieved an accuracy of 82.64%, while on malware-specific threat intelligence data it achieved an accuracy of 93.95%. The code for this research is publicly available at https://github.com/stwater20/ner bert crf open version.
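The role of the CRF layer described in the abstract can be illustrated with a minimal sketch of linear-chain CRF decoding (the Viterbi algorithm). This is not the authors' code. The tag set, the emission scores (standing in for per-token scores that a linear projection of BERT embeddings might produce), and the transition scores are all hypothetical values chosen for illustration.

```python
from typing import List, Sequence

# Hypothetical BIO tag set; not the label set used in the paper.
TAGS = ["O", "B-MAL", "I-MAL"]

# Hypothetical per-token tag scores for a three-token sentence.
# In a BERT-CRF model these would come from BERT embeddings via a linear layer.
EMISSIONS = [
    [1.0, 0.9, 0.1],
    [0.2, 0.1, 1.5],
    [2.0, 0.1, 0.3],
]

# Hypothetical transition scores: TRANSITIONS[i][j] scores moving from tag i to tag j.
# The large negative O -> I-MAL score discourages an I- tag that follows O.
TRANSITIONS = [
    [0.5, 0.0, -10.0],
    [0.0, -1.0, 1.0],
    [0.0, -1.0, 0.5],
]


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence for a linear-chain CRF."""
    n_tags = len(transitions)
    # score[j] is the best score of any path that ends in tag j at the current token.
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        next_score, pointers = [], []
        for j in range(n_tags):
            # Best previous tag for landing on tag j at this token.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            next_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            pointers.append(best_i)
        score = next_score
        backpointers.append(pointers)
    # Follow the backpointers from the best final tag to recover the path.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def greedy_decode(emissions: Sequence[Sequence[float]]) -> List[int]:
    """Pick the best tag per token independently, ignoring transitions."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in emissions]


print([TAGS[i] for i in greedy_decode(EMISSIONS)])                # ['O', 'I-MAL', 'O']
print([TAGS[i] for i in viterbi_decode(EMISSIONS, TRANSITIONS)])  # ['B-MAL', 'I-MAL', 'O']
```

With these illustrative scores, per-token argmax yields an I-MAL tag directly after O, which is not a valid BIO sequence, while the CRF decoding uses the transition scores to select a consistent sequence. In a trained model, the transition scores are learned jointly with the encoder rather than set by hand.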
Pages: 7532-7537
Page count: 6