LogRep: Log-based Anomaly Detection by Representing both Semantic and Numeric Information in Raw Messages

Cited by: 0
Authors
Xie, Xiaoda [1]
Jiang, Songlei [1]
Huang, Chenlin [1]
Yu, Fengyuan [1]
Deng, Yunjia [2]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Changsha, Peoples R China
[2] China United Network Commun Co Ltd, Hunan Branch, Changsha, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Log-based anomaly detection; Log representation learning; Limited training data; Log heterogeneity; Log data analysis; LARGE-SCALE;
DOI
10.1109/ISSRE59848.2023.00015
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Log-based anomaly detection plays an essential role in system reliability fields such as software reliability and network reliability. System log data is semi-structured and heterogeneous: it contains both semantic parts and numeric variables, and both reflect abnormal system behavior. However, existing log-based anomaly detection methods fail to capture the numeric information in raw data, so their performance degrades sharply when only limited labeled data is available. To capture both semantic and numeric information and thereby enhance anomaly detection, we propose LogRep, a novel representation-based log anomaly detection method that captures both kinds of information in its learned representations. LogRep's newly proposed position-aware numeric representation learning module and attention-based representation fusion module address the heterogeneity of log data. Owing to the high quality of the learned log representations, LogRep achieves anomaly detection performance comparable to SOTA methods while using two orders of magnitude less training data. As the training data scale is reduced, the performance of SOTA methods drops substantially, while LogRep remains stable on two public datasets (HDFS and BGL) and one self-collected dataset. Specifically, when only 1% of the training data is available, LogRep improves the F1 score over the second-best method by 10.6% and 5.8% on the BGL and HDFS datasets, respectively.
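To make the idea of "semantic parts and numeric variables" in a raw log message concrete, the following is a minimal sketch only. It is not LogRep's representation module: the helper `split_log_message`, the `<NUM>` placeholder, and the sample message are all hypothetical and invented for illustration. It simply records each numeric token together with its token position, the kind of input a position-aware numeric representation could start from.

```python
import re

# Hypothetical helper for illustration; not taken from the LogRep paper.
# Matches a whole token that is an integer or a simple decimal number.
NUMBER = re.compile(r"^-?\d+(?:\.\d+)?$")


def split_log_message(message):
    """Split a raw log message into semantic tokens and numeric variables.

    Returns (semantic_tokens, numerics). In semantic_tokens every numeric
    token is replaced by the placeholder "<NUM>"; numerics is a list of
    (token_position, value) pairs so the position of each number is kept.
    """
    semantic_tokens = []
    numerics = []
    for position, token in enumerate(message.split()):
        if NUMBER.match(token):
            semantic_tokens.append("<NUM>")
            numerics.append((position, float(token)))
        else:
            semantic_tokens.append(token)
    return semantic_tokens, numerics


# Invented sample message, not drawn from HDFS, BGL, or any real system.
tokens, numbers = split_log_message("worker 7 finished task in 3.5 seconds")
print(tokens)   # semantic view with placeholders
print(numbers)  # numeric view with token positions
```

Keeping the position alongside each value means the two views can be recombined later, which is the general motivation for treating numbers as more than generic placeholders.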
Pages: 195-206
Page count: 12
Related papers
50 in total
  • [41] PLELog: Semi-supervised Log-based Anomaly Detection via Probabilistic Label Estimation
    Yang, Lin
    Chen, Junjie
    Wang, Zan
    Wang, Weijing
    Jiang, Jiajun
    Dong, Xuyuan
    Zhang, Wenbin
    2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: COMPANION PROCEEDINGS (ICSE-COMPANION 2021), 2021, : 230 - 231
  • [42] Log-based anomaly detection for distributed systems: State of the art, industry experience, and open issues
    Wei, Xinjie
    Wang, Jie
    Sun, Chang-ai
    Towey, Dave
    Zhang, Shoufeng
    Zuo, Wanqing
    Yu, Yiming
    Ruan, Ruoyi
    Song, Guyang
    JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS, 2024, 36 (08)
  • [43] Early Exploration of Using ChatGPT for Log-based Anomaly Detection on Parallel File Systems Logs
    Egersdoerfer, Chris
    Zhang, Di
    Dai, Dong
    PROCEEDINGS OF THE 32ND INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE PARALLEL AND DISTRIBUTED COMPUTING, HPDC 2023, 2023, : 315 - 316
  • [44] SpikeLog: Log-based anomaly detection via Potential-assisted Spiking Neuron Network
    Qi, J.
    Luan, Z.
    Huang, S.
    Fung, C.
    Yang, H.
    Qian, D.
    IEEE Transactions on Knowledge and Data Engineering, 2024, 36 (12) : 1 - 15
  • [45] Sprelog: Log-Based Anomaly Detection with Self-matching Networks and Pre-trained Models
    Yang, Haitian
    Zhao, Xuan
    Sun, Degang
    Wang, Yan
    Huang, Weiqing
    SERVICE-ORIENTED COMPUTING (ICSOC 2021), 2021, 13121 : 736 - 743
  • [46] Log-Based Anomaly Detection with Multi-Head Scaled Dot-Product Attention Mechanism
    Du, Qingfeng
    Zhao, Liang
    Xu, Jincheng
    Han, Yongqi
    Zhang, Shuangli
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2021, PT I, 2021, 12923 : 335 - 347
  • [47] Drill: Log-based Anomaly Detection for Large-scale Storage Systems Using Source Code Analysis
    Zhang, Di
    Egersdoerfer, Chris
    Mahmud, Tabassum
    Zheng, Mai
    Dai, Dong
    2023 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM, IPDPS, 2023, : 189 - 199
  • [48] LogBASA: Log Anomaly Detection Based on System Behavior Analysis and Global Semantic Awareness
    Liao, Liping
    Zhu, Ke
    Luo, Jianzhen
    Cai, Jun
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2023, 2023
  • [49] Intrusion Detection Based on Sequential Information Preserving Log Embedding Methods and Anomaly Detection Algorithms
    Kim, Czangyeob
    Jang, Myeongjun
    Seo, Seungwan
    Park, Kyeongchan
    Kang, Pilsung
    IEEE ACCESS, 2021, 9 : 58088 - 58101
  • [50] MLog: Mogrifier LSTM-Based Log Anomaly Detection Approach Using Semantic Representation
    Fu, Yuanyuan
    Liang, Kun
    Xu, Jian
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2023, 16 (05) : 3537 - 3549