Improving Log-Based Anomaly Detection by Pre-Training Hierarchical Transformers

Cited by: 5
Authors
Huang, Shaohan [1]
Liu, Yi [1]
Fung, Carol [2]
Wang, He [1]
Yang, Hailong [1]
Luan, Zhongzhi [1]
Affiliations
[1] Beihang Univ, Sino German Joint Software Inst, Beijing 100191, Peoples R China
[2] Concordia Univ, Concordia Inst Informat Syst Engn, Montreal, PQ H3G 1M8, Canada
Funding
National Natural Science Foundation of China
Keywords
Log-based anomaly detection; pre-training; hierarchical transformers; robustness
DOI
10.1109/TC.2023.3257518
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Pre-trained models, such as BERT, have resulted in significant improvements in many natural language processing (NLP) applications. However, due to differences in word distribution and domain data distribution, applying NLP advancements to log analysis directly faces some performance challenges. This paper studies how to adapt the recently introduced pre-trained language model BERT for log analysis. In this work, we propose a pre-trained log representation model with hierarchical bidirectional encoder transformers (namely, HilBERT). Unlike previous work, which used raw text as pre-training data, we parse logs into templates before using the log templates to pre-train HilBERT. We also design a hierarchical transformers model to capture log template sequence-level information. We use log-based anomaly detection for downstream tasks and fine-tune our model with different log data. Our experiments demonstrate that HilBERT outperforms other baseline techniques on unstable log data. While BERT obtains performance comparable to that of previous state-of-the-art models, HilBERT can significantly address the problem of log instability and achieve accurate and robust results.
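The abstract states that logs are parsed into templates before pre-training but does not name the parser. As a rough illustration of that step only, the Python sketch below masks variable fields with placeholders and maps each resulting template to an integer ID. The masking rules, the placeholder names, and the helper names `to_template` and `to_template_ids` are assumptions made for this example and are not taken from the paper.

```python
import re

# Hypothetical masking rules (assumed for illustration, not the paper's parser).
# Order matters: more specific patterns run before the generic number rule.
MASK_RULES = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b"), "<IP>"),  # IPv4, optional port
    (re.compile(r"\bblk_-?\d+\b"), "<BLK>"),                        # HDFS-style block IDs
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),                   # hexadecimal values
    (re.compile(r"\b\d+\b"), "<NUM>"),                              # remaining integers
]


def to_template(message: str) -> str:
    """Replace variable fields in one log message with placeholders."""
    for pattern, placeholder in MASK_RULES:
        message = pattern.sub(placeholder, message)
    return message


def to_template_ids(messages):
    """Map each message to the ID of its template, assigning IDs in order of first appearance."""
    vocab = {}
    ids = []
    for message in messages:
        template = to_template(message)
        ids.append(vocab.setdefault(template, len(vocab)))
    return ids, vocab


if __name__ == "__main__":
    logs = [
        "Received block blk_-1608999687919862906 of size 91178 from 10.250.19.102",
        "Received block blk_42 of size 7 from 10.0.0.1:50010",
        "Deleting block blk_7 file /data/blk_7",
    ]
    ids, vocab = to_template_ids(logs)
    for template, template_id in vocab.items():
        print(template_id, template)
    print(ids)
```

A sequence of template IDs like the one returned here is the kind of input a sequence-level model could consume; how HilBERT actually encodes templates and sequences is not specified in this record.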
Pages: 2656-2667
Number of pages: 12