AECR: Automatic attack technique intelligence extraction based on fine-tuned large language model

Cited by: 0
Authors: Chen, Minghao [1]; Zhu, Kaijie [3]; Lu, Bin [1]; Li, Ding [1]; Yuan, Qingjun [1]; Zhu, Yuefei [1]
Source: Computers & Security
Keywords: Cyber attacks
DOI: 10.1016/j.cose.2024.104213
Abstract:
Cyber Threat Intelligence (CTI) reports contain rich intelligence on cyber-attack campaigns, which greatly helps security analysts infer attack trends and strengthen their defenses. However, due to the diversity of report content and writing styles, current intelligence extraction relies mostly on time-consuming manual effort. Moreover, existing automatic methods generally neglect the importance of background knowledge and produce inexact extraction results. These problems prevent the effective utilization and sharing of intelligence from CTI reports. In this paper, we focus on the automatic extraction of attack technique (AT) intelligence, which reveals patterns of attack behaviors and hardly changes over time. We propose AECR, a novel automatic AT extraction pipeline for CTI reports. AECR explores the feasibility of extracting AT intelligence with a fine-tuned large language model (LLM). In particular, we endow the selected LLM with enhanced domain-specific knowledge to improve its comprehension of AT-relevant content and to alleviate the hallucination problem. Experimental results demonstrate that AECR outperforms state-of-the-art methods by a wide margin at a reasonable time cost, improving accuracy, precision, recall, and F1-score by 108%, 37.2%, 22.4%, and 67.5%, respectively. To the best of our knowledge, AECR is the first to perform AT extraction based on a fine-tuned LLM. © 2024 Elsevier Ltd
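The abstract describes the pipeline only at a high level. As a rough illustration of what fine-tuning an LLM for AT extraction can look like in practice, the sketch below packs ATT&CK background knowledge into the prompt (echoing the paper's domain-knowledge enhancement), pairs a CTI-report sentence with a MITRE ATT&CK technique ID, and runs one LoRA fine-tuning step. The base model, prompt template, hyperparameters, and training pair are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of supervised LoRA fine-tuning for attack-technique (AT)
# extraction, in the spirit of AECR. All specifics below (base model, prompt
# wording, example data) are assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

BASE_MODEL = "meta-llama/Llama-2-7b-hf"  # assumed base model, not from the paper

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# LoRA keeps fine-tuning cheap: only low-rank adapter weights are trained.
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

def build_example(report_sentence: str, technique_id: str, background: str):
    """Prepend domain background knowledge to the prompt and supervise
    only the answer tokens (prompt positions are masked with -100)."""
    prompt = (
        "Background: " + background + "\n"
        "CTI report: " + report_sentence + "\n"
        "Attack technique ID: "
    )
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    answer_ids = tokenizer(technique_id + tokenizer.eos_token,
                           return_tensors="pt",
                           add_special_tokens=False).input_ids
    input_ids = torch.cat([prompt_ids, answer_ids], dim=1)
    labels = input_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100  # no loss on the prompt tokens
    return input_ids, labels

# One toy training step on a single hand-written pair (hypothetical data).
input_ids, labels = build_example(
    "The malware creates a scheduled task to persist across reboots.",
    "T1053.005",
    "T1053.005 (Scheduled Task) is an ATT&CK persistence technique.",
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()
optimizer.step()
print(f"one-step loss: {loss.item():.3f}")
```

In a real pipeline this single step would be replaced by a training loop over many (report sentence, technique ID) pairs, and the background field would be populated per example, e.g. from ATT&CK technique descriptions.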
Related papers (showing 10 of 50):
  • [1] CentralBankRoBERTa: A fine-tuned large language model for central bank communications. Pfeifer, Moritz; Marohl, Vincent P. Journal of Finance and Data Science, 2023, 9.
  • [2] Taiyi: a bilingual fine-tuned large language model for diverse biomedical tasks. Luo, Ling; Ning, Jinzhong; Zhao, Yingwen; Wang, Zhijun; Ding, Zeyuan; Chen, Peng; Fu, Weiru; Han, Qinyu; Xu, Guangtao; Qiu, Yunzhi; Pan, Dinghao; Li, Jiru; Li, Hao; Feng, Wenduo; Tu, Senbo; Liu, Yuqi; Yang, Zhihao; Wang, Jian; Sun, Yuanyuan; Lin, Hongfei. Journal of the American Medical Informatics Association, 2024, 31(9): 1865-1874.
  • [3] EpilepsyLLM: Domain-Specific Large Language Model Fine-tuned with Epilepsy Medical Knowledge. Zhao, Xuyang; Zhao, Qibin; Tanaka, Toshihisa. arXiv preprint.
  • [4] A fine-tuned large language model based molecular dynamics agent for code generation to obtain material thermodynamic parameters. Shi, Zhuofan; Xin, Chunxiao; Huo, Tong; Jiang, Yuntao; Wu, Bowen; Chen, Xingyue; Qin, Wei; Ma, Xinjian; Huang, Gang; Wang, Zhenyu; Jing, Xiang. Scientific Reports, 15(1).
  • [5] Website Category Classification Using Fine-tuned BERT Language Model. Demirkiran, Ferhat; Cayir, Aykut; Unal, Ugur; Dag, Hasan. 2020 5th International Conference on Computer Science and Engineering (UBMK), 2020: 333-336.
  • [6] Arabic sarcasm detection: An enhanced fine-tuned language model approach. Galal, Mohamed A.; Yousef, Ahmed Hassan; Zayed, Hala H.; Medhat, Walaa. Ain Shams Engineering Journal, 2024, 15(6).
  • [7] Extracting structured data from organic synthesis procedures using a fine-tuned large language model. Ai, Qianxiang; Meng, Fanwang; Shi, Jiale; Pelkie, Brenden; Coley, Connor W. Digital Discovery, 2024, 3(9): 1822-1831.
  • [8] The Fine-Tuned Large Language Model for Extracting the Progressive Bone Metastasis from Unstructured Radiology Reports. Kanemaru, Noriko; Yasaka, Koichiro; Fujita, Nana; Kanzawa, Jun; Abe, Osamu. Journal of Imaging Informatics in Medicine, 2024: 865-872.
  • [9] Automatic Component Prediction for Issue Reports Using Fine-Tuned Pretrained Language Models. Wang, Dae-Sung; Lee, Chan-Gun. IEEE Access, 2022, 10: 131456-131468.
  • [10] Fine-Tuned BERT Model for Large Scale and Cognitive Classification of MOOCs. Sebbaq, Hanane; El Faddouli, Nour-eddine. International Review of Research in Open and Distributed Learning, 2022, 23(2): 170-190.