LogRep: Log-based Anomaly Detection by Representing both Semantic and Numeric Information in Raw Messages

Cited by: 0
Authors
Xie, Xiaoda [1]
Jiang, Songlei [1]
Huang, Chenlin [1]
Yu, Fengyuan [1]
Deng, Yunjia [2]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Changsha, Peoples R China
[2] China United Network Commun Co Ltd, Hunan Branch, Changsha, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Log-based anomaly detection; Log representation learning; Limited training data; Log heterogeneity; Log data analysis; LARGE-SCALE;
DOI
10.1109/ISSRE59848.2023.00015
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Log-based anomaly detection plays an essential role in system reliability fields such as software reliability and network reliability. System logs are semi-structured, heterogeneous data containing both semantic text and numeric variables, each of which can reflect abnormal system behavior. However, existing log-based anomaly detection methods fail to capture the numeric information in raw messages, so their performance degrades substantially when only limited labeled data is available. To capture both semantic and numeric information, we propose LogRep, a novel representation-based log anomaly detection method whose learned representations encode both. Two newly proposed components, a position-aware numeric representation learning module and an attention-based representation fusion module, address the heterogeneity of log data. Owing to the quality of the learned log representations, LogRep achieves anomaly detection performance comparable to state-of-the-art (SOTA) methods while using two orders of magnitude less training data. As the training data is reduced, the performance of SOTA methods drops sharply, whereas LogRep remains stable on two public datasets (HDFS and BGL) and one self-collected dataset. Specifically, when only 1% of the training data is available, LogRep improves F1 score over the second-best method by 10.6% on BGL and 5.8% on HDFS.
Pages: 195-206
Page count: 12