An empirical study of the impact of log parsers on the performance of log-based anomaly detection

被引:0
|
作者
Ying Fu
Meng Yan
Zhou Xu
Xin Xia
Xiaohong Zhang
Dan Yang
机构
[1] Key Laboratory of Dependable Service Computing in Cyber Physical Society (Chongqing University),School of Big Data and Software Engineering
[2] Ministry of Education,Software Engineering Application Technology Lab
[3] Chongqing University,undefined
[4] Huawei,undefined
来源
关键词
Log parser; Anomaly detection; Empirical study;
D O I
暂无
中图分类号
学科分类号
摘要
Log-based anomaly detection plays an essential role in the fast-emerging Artificial Intelligence for IT Operations (AIOps) of software systems. Many log-based anomaly detection methods have been proposed. Due to the variety and unstructured characteristics of logs, log parsing is the first necessary step for parsing logs into structured ones in log-based anomaly detection methods. Prior studies have found that the effectiveness of log parsing will impact the performance of log-based anomaly detection. However, few studies comprehensively investigate whether better log parsing implies better anomaly detection. In this paper, we conduct a comprehensively empirical study to investigate the impact of six state-of-the-art log parsers belonging to four categories (including heuristic-based, frequency-based, clustering-based, and subsequence-based) on six state-of-the-art log-based anomaly detection methods (including machine-learning-based and deep-learning-based methods). Experimental results on three public datasets show that (1) High parsing accuracy does not definitely imply high anomaly detection performance. Both parsing accuracy and the number of parsed event templates should be considered when choosing log parsers for anomaly detection. (2) The log parsers have an impact on the efficiency of anomaly detection methods. With the increase in the number of parsed event templates, the efficiency of anomaly detection decreases. In detail, the heuristic-based parsers have less impact on the efficiency of anomaly detection methods, followed by frequency-based parsers. (3) All the anomaly detection methods perform more effectively and efficiently with the heuristic-based log parsers. Thus, the heuristic-based log parsers are recommended for a new practitioner on anomaly detection. We believe that our work, with the evaluation results and the corresponding findings, can help researchers and practitioners better understand the impact of log parsers on anomaly detection and provide guidelines for choosing a suitable log parser for their anomaly detection method.
引用
收藏
相关论文
共 50 条
  • [1] An empirical study of the impact of log parsers on the performance of log-based anomaly detection
    Fu, Ying
    Yan, Meng
    Xu, Zhou
    Xia, Xin
    Zhang, Xiaohong
    Yang, Dan
    [J]. EMPIRICAL SOFTWARE ENGINEERING, 2023, 28 (01)
  • [2] Log-based Anomaly Detection Without Log Parsing
    Van-Hoang Le
    Zhang, Hongyu
    [J]. 2021 36TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING ASE 2021, 2021, : 492 - 504
  • [3] Leveraging Log Instructions in Log-based Anomaly Detection
    Bogatinovski, Jasmin
    Madjarov, Gjorgji
    Nedelkoski, Sasho
    Cardoso, Jorge
    Kao, Odej
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON SERVICES COMPUTING (IEEE SCC 2022), 2022, : 321 - 326
  • [4] On the effectiveness of log representation for log-based anomaly detection
    Wu, Xingfang
    Li, Heng
    Khomh, Foutse
    [J]. EMPIRICAL SOFTWARE ENGINEERING, 2023, 28 (06)
  • [5] On the effectiveness of log representation for log-based anomaly detection
    Xingfang Wu
    Heng Li
    Foutse Khomh
    [J]. Empirical Software Engineering, 2023, 28
  • [6] Robust Log-Based Anomaly Detection on Unstable Log Data
    Zhang, Xu
    Xu, Yong
    Lin, Qingwei
    Qiao, Bo
    Zhang, Hongyu
    Dang, Yingnong
    Xie, Chunyu
    Yang, Xinsheng
    Cheng, Qian
    Li, Ze
    Chen, Junjie
    He, Xiaoting
    Yao, Randolph
    Lou, Jian-Guang
    Chintalapati, Murali
    Shen, Furao
    Zhang, Dongmei
    [J]. ESEC/FSE'2019: PROCEEDINGS OF THE 2019 27TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, 2019, : 807 - 817
  • [7] Review on Log-Based Anomaly Detection Techniques
    Raut, Pooja
    Mishra, Akanksha
    Rao, Shreya
    Kawoor, Saloni
    Shelke, Sushila
    Deore, Mahendra
    Kumar, Vivek
    [J]. PROCEEDINGS OF SECOND INTERNATIONAL CONFERENCE ON SUSTAINABLE EXPERT SYSTEMS (ICSES 2021), 2022, 351 : 893 - 906
  • [8] Transfer Log-based Anomaly Detection with Pseudo Labels
    Huang, Shaohan
    Liu, Yi
    Fung, Carol
    He, Rong
    Zhao, Yining
    Yang, Hailong
    Luan, Zhongzhi
    [J]. 2020 16TH INTERNATIONAL CONFERENCE ON NETWORK AND SERVICE MANAGEMENT (CNSM), 2020,
  • [9] An unsupervised heterogeneous log-based framework for anomaly detection
    Hajamydeen, Asif Iqbal
    Udzir, Nur Izura
    Mahmod, Ramlan
    Abdul Ghani, Abdul Azim
    [J]. TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2016, 24 (03) : 1117 - 1134
  • [10] LogDP: Combining Dependency and Proximity for Log-Based Anomaly Detection
    Xie, Yongzheng
    Zhang, Hongyu
    Zhang, Bo
    Babar, Muhammad Ali
    Lu, Sha
    [J]. SERVICE-ORIENTED COMPUTING (ICSOC 2021), 2021, 13121 : 708 - 716