Enhancing Cyber Threat Intelligence with Named Entity Recognition using BERT-CRF

Cited by: 1
Authors
Chen, Sheng-Shan [1]
Hwang, Ren-Hung [2]
Sun, Chin-Yu [1]
Lin, Ying-Dar [3]
Pai, Tun-Wen [1]
Affiliations
[1] Natl Taipei Univ Technol, Dept Comp Sci & Informat Engn, Taipei, Taiwan
[2] Natl Yang Ming Chiao Tung Univ, Coll Artificial Intelligence, Tainan, Taiwan
[3] Natl Yang Ming Chiao Tung Univ, Dept Comp, Hsinchu, Taiwan
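Keywords
cyber threat intelligence; deep learning; cyber security
DOI
10.1109/GLOBECOM54140.2023.10436853
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract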
Cyber Threat Intelligence (CTI) helps organizations understand the tactics, techniques, and procedures used by potential cyber criminals so that they can defend against cyber threats. To protect the core systems and services of organizations, security analysts must analyze information about threats and vulnerabilities. However, analyzing large amounts of data requires significant time and effort. To streamline this process, we propose an enhanced architecture, BERT-CRF, obtained by removing the BiLSTM layer from the conventional BERT-BiLSTM-CRF model. This model leverages the strengths of deep learning-based language models to extract critical threat intelligence and novel information from threats effectively. In our BERT-CRF model, the token embeddings generated by BERT are fed directly into the Conditional Random Field (CRF) layer for efficient Named Entity Recognition (NER), eliminating the need for an intermediate BiLSTM layer. We train and evaluate the model on three publicly available threat entity databases. We also collect open-source threat intelligence data from recent years to evaluate the applicability of the constructed model in a real-world environment. Furthermore, we compare our model with the most popular GPT-3.5 and the most downloaded open-source BERT question-and-answer models. In this study, our proposed model demonstrated robust usability and outperformed the other models, indicating its potential for application in CTI. In a real-world scenario, our model achieved an accuracy of 82.64%, while on malware-specific threat intelligence data it achieved an accuracy of 93.95%. The code for this research is publicly available at https://github.com/stwater20/ner_bert_crf_open_version.
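To make the role of the CRF layer concrete, the following is a minimal sketch of Viterbi decoding over per-token tag scores. It is not the authors' implementation. The tag set, the emission scores (a hand-written stand-in for BERT token embeddings projected to tag scores), and the transition scores are all hypothetical values chosen for illustration; in a trained model both would be learned.

```python
# Hypothetical BIO tag set; the label set used in the paper is not given in the abstract.
TAGS = ["O", "B-MAL", "I-MAL"]


def viterbi_decode(emissions, transitions):
    """Return the highest-scoring sequence of tag indices.

    emissions[t][j]   : score of tag j at token t (stand-in for projected BERT output)
    transitions[i][j] : score of moving from tag i to tag j (the CRF parameters)
    """
    n_tags = len(transitions)
    scores = list(emissions[0])  # best path score ending in each tag at token 0
    backpointers = []
    for emit in emissions[1:]:
        new_scores, pointers = [], []
        for j in range(n_tags):
            # Best previous tag for the current tag j.
            best_prev = max(range(n_tags), key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_prev] + transitions[best_prev][j] + emit[j])
            pointers.append(best_prev)
        scores = new_scores
        backpointers.append(pointers)
    # Trace the best path back from the best final tag.
    path = [max(range(n_tags), key=lambda j: scores[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Illustrative transition scores: "O -> I-MAL" is heavily penalized because
# an inside tag should not follow an outside tag in BIO labeling.
transitions = [
    [0.5, 0.0, -10.0],  # from O
    [0.0, -1.0, 1.0],   # from B-MAL
    [0.0, -1.0, 0.5],   # from I-MAL
]

# Made-up emission scores for two tokens. Taking the best tag per token
# independently would give O, I-MAL, which is not a valid BIO sequence.
emissions = [
    [1.0, 0.8, 0.0],
    [0.0, 0.0, 2.0],
]

decoded = [TAGS[i] for i in viterbi_decode(emissions, transitions)]
print(decoded)  # ['B-MAL', 'I-MAL']
```

The example shows why a CRF on top of token-level scores can help: decoding scores the whole tag sequence, so a strong "I-MAL" score on the second token pulls the first token to "B-MAL" instead of producing an invalid "O, I-MAL" pair.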
Pages: 7532-7537
Number of pages: 6