On the effectiveness of log representation for log-based anomaly detection

Cited by: 0
Authors
Xingfang Wu
Heng Li
Foutse Khomh
Affiliation
[1] Polytechnique Montreal, Department of Computer Engineering and Software Engineering
Keywords
Log representation; Anomaly detection; Automated log analysis
DOI
Not available
Abstract
Logs are an essential source of information for understanding the running status of a software system. As modern software architectures and maintenance methods evolve, more research effort has been devoted to automated log analysis. In particular, machine learning (ML) has been widely used in log analysis tasks. In ML-based log analysis, converting textual log data into numerical feature vectors is a critical and indispensable step. However, the impact of different log representation techniques on the performance of downstream models is not clear, which limits the ability of researchers and practitioners to choose the optimal log representation techniques for their automated log analysis workflows. Therefore, this work investigates and compares the log representation techniques commonly adopted in previous log analysis research. Specifically, we select six log representation techniques and evaluate them with seven ML models and four public log datasets (i.e., HDFS, BGL, Spirit and Thunderbird) in the context of log-based anomaly detection. We also examine the impacts of the log parsing process and of different feature aggregation approaches when they are employed with log representation techniques. From the experiments, we provide some heuristic guidelines for future researchers and developers to follow when designing an automated log analysis workflow. We believe our comprehensive comparison of log representation techniques can help researchers and practitioners better understand the characteristics of different log representation techniques and provide them with guidance for selecting the most suitable ones for their ML-based log analysis workflow.
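As a rough illustration of what "converting textual log data into numerical feature vectors" can look like, the following is a minimal sketch of one simple representation: per-session event counts. It is not taken from the paper. The function names `to_template` and `count_vectors`, the regex-based masking used as a stand-in for log parsing, and the sample messages are all assumptions made for the example.

```python
import re
from collections import Counter

def to_template(message: str) -> str:
    """Crude stand-in for log parsing (an assumption, not a real parser):
    mask hexadecimal values and numbers with the placeholder <*>."""
    return re.sub(r"0x[0-9a-fA-F]+|\d+", "<*>", message)

def count_vectors(sessions):
    """Represent each session (a list of raw log messages) as a vector
    of how often each template occurs in it."""
    # Fixed, sorted vocabulary of templates seen across all sessions.
    templates = sorted({to_template(m) for s in sessions for m in s})
    index = {t: i for i, t in enumerate(templates)}
    vectors = []
    for session in sessions:
        counts = Counter(to_template(m) for m in session)
        vec = [0] * len(templates)
        for template, n in counts.items():
            vec[index[template]] = n
        vectors.append(vec)
    return templates, vectors

# Hypothetical log messages, grouped into two sessions.
sessions = [
    ["Receiving block 101 src 10",
     "Receiving block 102 src 11",
     "Deleting block 101"],
    ["Receiving block 7 src 3"],
]
templates, vectors = count_vectors(sessions)
for t in templates:
    print(t)
print(vectors)
```

Each resulting vector could then be passed to a downstream ML model. Grouping messages into a session before counting is one possible form of feature aggregation; semantic representations would replace the counting step with embeddings of the template text.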
Related papers
50 in total
  • [41] Try with Simpler-An Evaluation of Improved Principal Component Analysis in Log-based Anomaly Detection
    Yang, Lin
    Chen, Junjie
    Gao, Shutao
    Gong, Zhihao
    Zhang, Hongyu
    Kang, Yue
    Li, Huaan
    [J]. ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2024, 33 (05)
  • [42] A Log-Based Anomaly Detection Method with Efficient Neighbor Searching and Automatic K Neighbor Selection
    Wang, Bingming
    Ying, Shi
    Yang, Zhe
    [J]. SCIENTIFIC PROGRAMMING, 2020, 2020 (2020)
  • [43] ASGNet: Adaptive Semantic Gate Networks for Log-Based Anomaly Diagnosis
    Yang, Haitian
    Sun, Degang
    Liu, Wen
    Li, Yanshu
    Wang, Yan
    Huang, Weiqing
    [J]. NEURAL INFORMATION PROCESSING, ICONIP 2023, PT IV, 2024, 14450 : 200 - 212
  • [44] Log anomaly detection based on BERT
    Tang, Pan
    Guan, Yepeng
    [J]. SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (8-9) : 6431 - 6441
  • [45] System Log-Based Android Root State Detection
    Jin, Junjie
    Zhang, Wei
    [J]. CLOUD COMPUTING AND SECURITY, PT II, 2017, 10603 : 793 - 798
  • [46] Event Log-based Weaknesses Detection in Business Processes
    Schuh, G.
    Guetzlaff, A.
    Schmitz, S.
    Schopen, M.
    Broehl, F.
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL ENGINEERING AND ENGINEERING MANAGEMENT (IEEE IEEM21), 2021, : 734 - 738
  • [47] Log-based Predictive Maintenance
    Sipos, Ruben
    Fradkin, Dmitriy
    Moerchen, Fabian
    Wang, Zhuang
    [J]. PROCEEDINGS OF THE 20TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'14), 2014, : 1867 - 1876
  • [48] PLELog: Semi-supervised Log-based Anomaly Detection via Probabilistic Label Estimation
    Yang, Lin
    Chen, Junjie
    Wang, Zan
    Wang, Weijing
    Jiang, Jiajun
    Dong, Xuyuan
    Zhang, Wenbin
    [J]. 2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: COMPANION PROCEEDINGS (ICSE-COMPANION 2021), 2021, : 230 - 231
  • [49] Log-Based Process Visualization
    Schwank, Johannes
    Schoeffel, Sebastian
    Ebert, Achim
    [J]. ADVANCES IN USABILITY, USER EXPERIENCE AND ASSISTIVE TECHNOLOGY, 2019, 794 : 741 - 751
  • [50] Log-based anomaly detection for distributed systems: State of the art, industry experience, and open issues
    Wei, Xinjie
    Wang, Jie
    Sun, Chang-ai
    Towey, Dave
    Zhang, Shoufeng
    Zuo, Wanqing
    Yu, Yiming
    Ruan, Ruoyi
    Song, Guyang
    [J]. JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS, 2024, 36 (08)