Improving Log-Based Anomaly Detection by Pre-Training Hierarchical Transformers

Cited by: 5
Authors
Huang, Shaohan [1 ]
Liu, Yi [1 ]
Fung, Carol [2 ]
Wang, He [1 ]
Yang, Hailong [1 ]
Luan, Zhongzhi [1 ]
Affiliations
[1] Beihang Univ, Sino German Joint Software Inst, Beijing 100191, Peoples R China
[2] Concordia Univ, Concordia Inst Informat Syst Engn, Montreal, PQ H3G 1M8, Canada
Funding
National Natural Science Foundation of China;
Keywords
Log-based anomaly detection; pre-training; hierarchical transformers; robustness;
DOI
10.1109/TC.2023.3257518
CLC number
TP3 [Computing and computer technology];
Discipline code
0812;
Abstract
Pre-trained models, such as BERT, have resulted in significant improvements in many natural language processing (NLP) applications. However, due to differences in word distribution and domain data distribution, directly applying NLP advances to log analysis faces performance challenges. This paper studies how to adapt the recently introduced pre-trained language model BERT for log analysis. We propose a pre-trained log representation model with hierarchical bidirectional encoder transformers (namely, HilBERT). Unlike previous work, which uses raw text as pre-training data, we parse logs into templates and pre-train HilBERT on the resulting template sequences. We also design a hierarchical transformer model to capture sequence-level information across log templates. We use log-based anomaly detection as the downstream task and fine-tune our model on different log data. Our experiments demonstrate that HilBERT outperforms other baseline techniques on unstable log data. While BERT obtains performance comparable to that of previous state-of-the-art models, HilBERT substantially mitigates the log-instability problem and achieves accurate and robust results.
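The abstract's first step, parsing raw logs into templates before pre-training, can be sketched as below. This is a minimal illustrative stand-in, not the paper's actual parser (which would typically be a full log parser such as Drain); the regex rules and the `<*>` wildcard token are assumptions chosen for the example.

```python
import re

def extract_template(log_line: str) -> str:
    """Replace variable fields (block IDs, IP addresses, hex values,
    plain numbers) with a wildcard token so that log lines sharing the
    same structure map to the same template string."""
    t = re.sub(r"blk_-?\d+", "<*>", log_line)               # HDFS-style block IDs
    t = re.sub(r"\b\d+\.\d+\.\d+\.\d+\b", "<*>", t)         # IPv4 addresses
    t = re.sub(r"0x[0-9a-fA-F]+", "<*>", t)                 # hex identifiers
    t = re.sub(r"\b\d+\b", "<*>", t)                        # remaining numbers
    return t

logs = [
    "Received block blk_3587508140051953248 of size 67108864 from 10.251.42.84",
    "Received block blk_5402003568334525940 of size 67108864 from 10.251.31.85",
]
templates = [extract_template(line) for line in logs]
# Both lines collapse to the same template:
# "Received block <*> of size <*> from <*>"
```

Pre-training then operates on sequences of such template strings rather than on raw log text, which is what lets the model tolerate variable fields and unstable log wording.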
Pages: 2656-2667
Number of pages: 12
Related papers
50 records in total
  • [31] TADPOLE: Task ADapted Pre-training via anOmaLy dEtection
    Madan, Vivek
    Khetan, Ashish
    Karnin, Zohar
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 5732 - 5746
  • [32] Pre-Training Transformers as Energy-Based Cloze Models
    Clark, Kevin
    Luong, Minh-Thang
    Le, Quoc V.
    Manning, Christopher D.
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 285 - 294
  • [33] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
    Dai, Zhigang
    Cai, Bolun
    Lin, Yugeng
    Chen, Junying
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 1601 - 1610
  • [34] Diffusion-based normality pre-training for weakly supervised video anomaly detection
    Basak, Suvramalya
    Gautam, Anjali
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 251
  • [35] Improving Fractal Pre-training
    Anderson, Connor
    Farrell, Ryan
    [J]. 2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 2412 - 2421
  • [36] Evaluation of FractalDB Pre-training with Vision Transformers
    Nakashima, K.
    Kataoka, H.
    Satoh, Y.
    [J]. Seimitsu Kogaku Kaishi/Journal of the Japan Society for Precision Engineering, 2023, 89 (01): : 99 - 104
  • [37] AFALog: A General Augmentation Framework for Log-based Anomaly Detection with Active Learning
    Duan, Chiming
    Jia, Tong
    Cai, Huaqian
    Li, Ying
    Huang, Gang
    [J]. 2023 IEEE 34TH INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING, ISSRE, 2023, : 46 - 56
  • [38] DualAttlog: Context aware dual attention networks for log-based anomaly detection
    Yang, Haitian
    Sun, Degang
    Huang, Weiqing
    [J]. NEURAL NETWORKS, 2024, 180
  • [39] Temporal Logical Attention Network for Log-Based Anomaly Detection in Distributed Systems
    Liu, Yang
    Ren, Shaochen
    Wang, Xuran
    Zhou, Mengjie
    [J]. SENSORS, 2024, 24 (24)
  • [40] Hilogx: noise-aware log-based anomaly detection with human feedback
    Jia, Tong
    Li, Ying
    Yang, Yong
    Huang, Gang
    [J]. THE VLDB JOURNAL, 2024, 33 : 883 - 900