Efficient anomaly detection in tabular cybersecurity data using large language models

被引:0
|
作者
Xiaoyong Zhao [1 ]
Xingxin Leng [1 ]
Lei Wang [1 ]
Ningning Wang [1 ]
Yanqiong Liu [1 ]
机构
[1] Beijing Information Science and Technology University,
关键词
Anomaly detection; Large language models; Network security; Prompt engineering; Tabular data;
D O I
10.1038/s41598-025-88050-z
中图分类号
学科分类号
摘要
In cybersecurity, anomaly detection in tabular data is essential for ensuring information security. While traditional machine learning and deep learning methods have shown some success, they continue to face significant challenges in terms of generalization. To address these limitations, this paper presents an innovative method for tabular data anomaly detection based on large language models, called “Tabular Anomaly Detection via Guided Prompts” (TAD-GP). This approach utilizes a 7-billion-parameter open-source model and incorporates strategies such as data sample introduction, anomaly type recognition, chain-of-thought reasoning, multi-turn dialogue, and key information reinforcement. Experimental results indicate that the TAD-GP framework improves F1 scores by 79.31%, 97.96%, and 59.09% on the CICIDS2017, KDD Cup 1999, and UNSW-NB15 datasets, respectively. Furthermore, the smaller-scale TAD-GP model outperforms larger models across multiple datasets, demonstrating its practical potential in environments with constrained computational resources and requirements for private deployment. This method addresses a critical gap in research on anomaly detection in cybersecurity, specifically using small-scale open-source models.
引用
收藏
相关论文
共 50 条
  • [41] Data extraction from polymer literature using large language models
    Gupta, Sonakshi
    Mahmood, Akhlak
    Shetty, Pranav
    Adeboye, Aishat
    Ramprasad, Rampi
    Communications Materials, 2024, 5 (01)
  • [42] Implementation of Large Language Models and Agricultural Knowledge Graphs for Efficient Plant Disease Detection
    Zhao, Xinyan
    Chen, Baiyan
    Ji, Mengxue
    Wang, Xinyue
    Yan, Yuhan
    Zhang, Jinming
    Liu, Shiyingjie
    Ye, Muyang
    Lv, Chunli
    AGRICULTURE-BASEL, 2024, 14 (08):
  • [43] Cybersecurity Anomaly Detection: AI and Ethereum Blockchain for a Secure and Tamperproof IoHT Data Management
    Olawale, Oluwaseun Priscilla
    Ebadinezhad, Sahar
    IEEE ACCESS, 2024, 12 : 131605 - 131620
  • [44] An Efficient Anomaly Detection Framework for Electromagnetic Streaming Data
    Sun, Degang
    Hu, Yulan
    Shi, Zhixin
    Xu, Guokun
    Zhou, Wei
    ICBDC 2019: PROCEEDINGS OF 2019 4TH INTERNATIONAL CONFERENCE ON BIG DATA AND COMPUTING, 2019, : 151 - 155
  • [45] Correlated Anomaly Detection from Large Streaming Data
    Chen, Zheng
    Yu, Xinli
    Ling, Yuan
    Song, Bo
    Quan, Wei
    Hu, Xiaohua
    Yan, Erjia
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 982 - 992
  • [46] Explainable and Interpretable Anomaly Detection Models for Production Data
    Alharbi, Basma
    Liang, Zhenwen
    Aljindan, Jana M.
    Agnia, Ammar K.
    Zhang, Xiangliang
    SPE JOURNAL, 2022, 27 (01): : 349 - 363
  • [47] Cyber anomaly detection: Using tabulated vectors and embedded analytics for efficient data mining
    Gutierrez, Robert J.
    Bauer, Kenneth W.
    Boehmke, Bradley C.
    Saie, Cade M.
    Bihl, Trevor J.
    JOURNAL OF ALGORITHMS & COMPUTATIONAL TECHNOLOGY, 2018, 12 (04) : 293 - 310
  • [48] Demystifying Data Management for Large Language Models
    Miao, Xupeng
    Jia, Zhihao
    Cui, Bin
    COMPANION OF THE 2024 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, SIGMOD-COMPANION 2024, 2024, : 547 - 555
  • [49] Using large language models in psychology
    Demszky, Dorottya
    Yang, Diyi
    Yeager, David
    Bryan, Christopher
    Clapper, Margarett
    Chandhok, Susannah
    Eichstaedt, Johannes
    Hecht, Cameron
    Jamieson, Jeremy
    Johnson, Meghann
    Jones, Michaela
    Krettek-Cobb, Danielle
    Lai, Leslie
    Jonesmitchell, Nirel
    Ong, Desmond
    Dweck, Carol
    Gross, James
    Pennebaker, James
    NATURE REVIEWS PSYCHOLOGY, 2023, 2 (11): : 688 - 701
  • [50] Using large language models in psychology
    Dorottya Demszky
    Diyi Yang
    David S. Yeager
    Christopher J. Bryan
    Margarett Clapper
    Susannah Chandhok
    Johannes C. Eichstaedt
    Cameron Hecht
    Jeremy Jamieson
    Meghann Johnson
    Michaela Jones
    Danielle Krettek-Cobb
    Leslie Lai
    Nirel JonesMitchell
    Desmond C. Ong
    Carol S. Dweck
    James J. Gross
    James W. Pennebaker
    Nature Reviews Psychology, 2023, 2 : 688 - 701