On the effectiveness of log representation for log-based anomaly detection

Cited by: 0
Authors
Xingfang Wu
Heng Li
Foutse Khomh
Affiliation
[1] Polytechnique Montreal, Department of Computer Engineering and Software Engineering
Source
Keywords
Log representation; Anomaly detection; Automated log analysis
DOI
Not available
CLC number
Subject classification number
Abstract
Logs are an essential source of information for understanding the running status of a software system. As modern software architectures and maintenance methods evolve, more research effort has been devoted to automated log analysis. In particular, machine learning (ML) has been widely used in log analysis tasks. In ML-based log analysis, converting textual log data into numerical feature vectors is a critical and indispensable step. However, the impact of different log representation techniques on the performance of downstream models is not clear, which limits the ability of researchers and practitioners to choose the optimal log representation technique for their automated log analysis workflows. Therefore, this work investigates and compares the log representation techniques commonly adopted in previous log analysis research. Specifically, we select six log representation techniques and evaluate them with seven ML models and four public log datasets (i.e., HDFS, BGL, Spirit and Thunderbird) in the context of log-based anomaly detection. We also examine the impact of the log parsing process and of different feature aggregation approaches when they are employed with log representation techniques. From the experiments, we provide heuristic guidelines for future researchers and developers to follow when designing an automated log analysis workflow. We believe our comprehensive comparison can help researchers and practitioners better understand the characteristics of different log representation techniques and guide them in selecting the most suitable ones for their ML-based log analysis workflows.
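To make the "textual log data to numerical feature vectors" step concrete, the following is a minimal sketch of one possible representation pipeline. The abstract does not name the six techniques that were evaluated, so TF-IDF weighting and mean aggregation are used here purely as illustrative assumptions, and the log messages are hypothetical toy strings rather than records from any of the four datasets.

```python
import math
from collections import Counter


def tokenize(message):
    # Simplistic whitespace tokenizer; real pipelines often parse logs first.
    return message.lower().split()


def build_idf(messages):
    # Smoothed inverse document frequency over a small corpus of log messages.
    n = len(messages)
    df = Counter()
    for m in messages:
        df.update(set(tokenize(m)))
    vocab = sorted(df)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    return vocab, idf


def tfidf_vector(message, vocab, idf):
    # Event-level representation: one TF-IDF weight per vocabulary token.
    counts = Counter(tokenize(message))
    return [counts[t] * idf[t] for t in vocab]


def mean_aggregate(vectors):
    # Sequence-level representation: element-wise mean of the event vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


# Hypothetical toy session of log messages (not taken from any real dataset).
session = [
    "receiving block from node",
    "receiving block from node",
    "exception while serving block",
]

vocab, idf = build_idf(session)
event_vectors = [tfidf_vector(m, vocab, idf) for m in session]
sequence_vector = mean_aggregate(event_vectors)
```

The resulting `sequence_vector` has one entry per vocabulary token and could be passed to any downstream classifier. Swapping `tfidf_vector` or `mean_aggregate` for another function is the kind of variation the study compares.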
Related papers
50 in total
  • [21] Improving Log-Based Anomaly Detection with Component-Aware Analysis
    Yin, Kun
    Yan, Meng
    Xu, Ling
    Xu, Zhou
    Li, Zhao
    Yang, Dan
    Zhang, Xiaohong
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME 2020), 2020, : 667 - 671
  • [22] LogCAD: An Efficient and Robust Model for Log-Based Conformal Anomaly Detection
    Liu, Chunbo
    Liang, Mengmeng
    Hou, Jingwen
    Gu, Zhaojun
    Wang, Zhi
    [J]. SECURITY AND COMMUNICATION NETWORKS, 2022, 2022
  • [23] Log-based Anomaly Detection with Deep Learning: How Far Are We?
    Le, Van-Hoang
    Zhang, Hongyu
    [J]. 2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022), 2022, : 1356 - 1367
  • [24] Log-based Intrusion Detection for MANET
    Alattar, Mouhannad
    Sailhan, Francoise
    Bourgeois, Julien
    [J]. 2012 8TH INTERNATIONAL WIRELESS COMMUNICATIONS AND MOBILE COMPUTING CONFERENCE (IWCMC), 2012, : 697 - 702
  • [25] Improving Log-Based Anomaly Detection by Pre-Training Hierarchical Transformers
    Huang, Shaohan
    Liu, Yi
    Fung, Carol
    Wang, He
    Yang, Hailong
    Luan, Zhongzhi
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2023, 72 (09) : 2656 - 2667
  • [26] AFALog: A General Augmentation Framework for Log-based Anomaly Detection with Active Learning
    Duan, Chiming
    Jia, Tong
    Cai, Huaqian
    Li, Ying
    Huang, Gang
    [J]. 2023 IEEE 34TH INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING, ISSRE, 2023, : 46 - 56
  • [27] DualAttlog: Context aware dual attention networks for log-based anomaly detection
    Yang, Haitian
    Sun, Degang
    Huang, Weiqing
    [J]. NEURAL NETWORKS, 2024, 180
  • [28] Temporal Logical Attention Network for Log-Based Anomaly Detection in Distributed Systems
    Liu, Yang
    Ren, Shaochen
    Wang, Xuran
    Zhou, Mengjie
    [J]. Sensors, 2024, 24 (24)
  • [29] Hilogx: noise-aware log-based anomaly detection with human feedback
    Jia, Tong
    Li, Ying
    Yang, Yong
    Huang, Gang
    [J]. The VLDB Journal, 2024, 33 : 883 - 900
  • [30] MoniLog: An Automated Log-Based Anomaly Detection System for Cloud Computing Infrastructures
    Vervaet, Arthur
    [J]. 2021 IEEE 37TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2021), 2021, : 2739 - 2743