Enhancing Cyber Threat Intelligence with Named Entity Recognition using BERT-CRF

Cited by: 1
Authors
Chen, Sheng-Shan [1]
Hwang, Ren-Hung [2]
Sun, Chin-Yu [1]
Lin, Ying-Dar [3]
Pai, Tun-Wen [1]
Affiliations
[1] Natl Taipei Univ Technol, Dept Comp Sci & Informat Engn, Taipei, Taiwan
[2] Natl Yang Ming Chiao Tung Univ, Coll Artificial Intelligence, Tainan, Taiwan
[3] Natl Yang Ming Chiao Tung Univ, Dept Comp, Hsinchu, Taiwan
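Keywords
cyber threat intelligence; deep learning; cyber security
DOI
10.1109/GLOBECOM54140.2023.10436853
Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract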
Cyber Threat Intelligence (CTI) helps organizations understand the tactics, techniques, and procedures used by potential cyber criminals so that they can defend against cyber threats. To protect the core systems and services of organizations, security analysts must analyze information about threats and vulnerabilities. However, analyzing large amounts of data requires significant time and effort. To streamline this process, we propose an enhanced architecture, BERT-CRF, obtained by removing the BiLSTM layer from the conventional BERT-BiLSTM-CRF model. This model leverages the strengths of deep learning-based language models to effectively extract critical threat intelligence and novel information from threat data. In our BERT-CRF model, the token embeddings generated by BERT are fed directly into the Conditional Random Field (CRF) layer for efficient Named Entity Recognition (NER), eliminating the need for an intermediate BiLSTM layer. We train and evaluate the model on three publicly available threat entity databases. We also collect open-source threat intelligence data from recent years to evaluate the applicability of the constructed model in a real-world environment. Furthermore, we compare our model with the popular GPT-3.5 model and the most downloaded open-source BERT question-answering models. In this study, the proposed model demonstrated robust usability and outperformed the other models, indicating its potential for application in CTI. In a real-world scenario, our model achieved an accuracy of 82.64%, while on malware-specific threat intelligence data it achieved an accuracy of 93.95%. The code for this research is publicly available at https://github.com/stwater20/ner_bert_crf_open_version.
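The abstract describes per-token BERT outputs being passed straight to a CRF layer. As a rough illustration of what that CRF layer contributes at inference time, the following is a minimal sketch of Viterbi decoding for a linear-chain CRF. It is not taken from the authors' repository. The tag set (`O`, `B-MAL`, `I-MAL`), the example tokens, and all emission and transition scores are hypothetical values chosen for illustration; in an actual BERT-CRF model the emission scores would come from BERT token embeddings after a linear projection, and the transition scores would be learned.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence for a linear-chain CRF.

    emissions[t][j]   : score of tag j at token t (stand-in for projected
                        BERT outputs; hypothetical values in this sketch).
    transitions[i][j] : score of moving from tag i to tag j.
    """
    num_tags = len(transitions)
    score = list(emissions[0])          # best score ending in each tag so far
    backpointers = []                   # best previous tag for each step/tag
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_tags):
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
        score = new_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical BIO tag set for a single "malware" entity type.
TAGS = ["O", "B-MAL", "I-MAL"]
FORBIDDEN = -1e4  # large penalty: I-MAL may not directly follow O

transitions = [
    [0.0, 0.0, FORBIDDEN],  # from O
    [0.0, 0.0, 0.5],        # from B-MAL
    [0.0, 0.0, 0.5],        # from I-MAL
]
emissions = [
    [2.0, 0.1, 0.3],  # "the"
    [0.2, 1.0, 1.5],  # "Emotet"  (per-token argmax would pick I-MAL)
    [0.1, 0.2, 1.8],  # "loader"
    [2.0, 0.1, 0.2],  # "spreads"
]

decoded = [TAGS[i] for i in viterbi_decode(emissions, transitions)]
print(decoded)
```

In this toy input, choosing the best tag for each token independently would label the second token `I-MAL` directly after `O`, which is not a valid BIO sequence. Scoring whole sequences with transition terms is what lets a CRF layer avoid such outputs without an intermediate BiLSTM.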
Pages: 7532-7537
Page count: 6