Valid Probabilistic Anomaly Detection Models for System Logs

Cited by: 3
Authors
Liu, Chunbo [1]
Pan, Lanlan [2]
Gu, Zhaojun [1]
Wang, Jialiang [2]
Ren, Yitong [2]
Wang, Zhi [3]
Affiliations
[1] Civil Aviat Univ China, Informat Secur Evaluat Ctr, Tianjin 300300, Peoples R China
[2] Civil Aviat Univ China, Coll Comp Sci & Technol, Tianjin 300300, Peoples R China
[3] Nankai Univ, Coll Cyber Sci, Tianjin 300350, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
37;
DOI
10.1155/2020/8827185
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
System logs record system status and important events during operation in detail, and detecting anomalies in these logs is a common practice in modern large-scale distributed systems. However, the threshold-based classification models used for anomaly detection output only one of two values, normal or abnormal, and give no estimate of the probability that a prediction is correct. In this paper, a statistical learning algorithm, the Venn-Abers predictor, is adopted to evaluate the confidence of prediction results in system log anomaly detection. It can calculate the probability distribution of labels for a set of samples and, to some extent, provide a quality assessment of the predicted labels. Two Venn-Abers predictors, LR-VA and SVM-VA, are implemented on top of Logistic Regression and Support Vector Machine, respectively. The differences among the algorithms are then exploited to build a multimodel fusion algorithm by Stacking, and a Venn-Abers predictor based on that algorithm, called Stacking-VA, is implemented. Four types of algorithms (unimodel, Venn-Abers predictor based on unimodel, multimodel, and Venn-Abers predictor based on multimodel) are compared in terms of validity and accuracy. Experiments are carried out on a log dataset from the Hadoop Distributed File System (HDFS). In the unimodel experiments, LR-VA and SVM-VA show better validity than their two corresponding underlying models. Compared with the underlying models, the SVM-VA predictor is more accurate than the LR-VA predictor and, more significantly, the recall rate increases from 81% to 94%. In the multimodel experiments, the algorithm based on Stacking multimodel fusion is significantly superior to the underlying classifier. The average accuracy of Stacking-VA is larger than 0.95, and its prediction results are more stable than those of LR-VA and SVM-VA. The experimental results show that the Venn-Abers predictor is a flexible tool that can make accurate and valid probability predictions in system log anomaly detection.
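As a rough illustration of how a Venn-Abers predictor turns the raw score of an underlying classifier into a probability interval, the following is a minimal sketch of the inductive variant in plain Python. It is not the authors' implementation: the function names (`isotonic_blocks`, `fitted_value`, `venn_abers_interval`) and the toy calibration data are hypothetical, and the scores are assumed to come from any scoring classifier such as Logistic Regression or an SVM. The idea is to add the test score to a held-out calibration set once with label 0 and once with label 1, fit an isotonic (non-decreasing) regression each time, and read off the two fitted values at the test score.

```python
def isotonic_blocks(points):
    """Pool-adjacent-violators fit of labels against scores.

    points: iterable of (score, label) pairs with label in {0, 1}.
    Returns blocks [label_sum, count, largest_score] in score order;
    the fitted value on a block is label_sum / count.
    """
    # Pool tied scores first so equal scores share one fitted value.
    pooled = {}
    for score, label in points:
        total, count = pooled.get(score, (0.0, 0))
        pooled[score] = (total + label, count + 1)

    blocks = []
    for score in sorted(pooled):
        total, count = pooled[score]
        blocks.append([total, count, score])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            last_total, last_count, last_top = blocks.pop()
            blocks[-1][0] += last_total
            blocks[-1][1] += last_count
            blocks[-1][2] = last_top
    return blocks


def fitted_value(blocks, score):
    """Fitted value at a score that was part of the fitted points."""
    for total, count, top in blocks:
        if score <= top:
            return total / count
    raise ValueError("score lies above every fitted point")


def venn_abers_interval(cal_scores, cal_labels, test_score):
    """Return (p0, p1): the probability of label 1 under each assumed test label."""
    calibration = list(zip(cal_scores, cal_labels))
    p0 = fitted_value(isotonic_blocks(calibration + [(test_score, 0)]), test_score)
    p1 = fitted_value(isotonic_blocks(calibration + [(test_score, 1)]), test_score)
    return p0, p1


# Toy calibration data (invented for illustration): classifier scores and true labels.
cal_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
cal_labels = [0, 0, 1, 0, 1, 0, 1, 1]

p0, p1 = venn_abers_interval(cal_scores, cal_labels, 0.5)
# One common way to merge the pair into a single probability estimate.
p_single = p1 / (1.0 - p0 + p1)
print(p0, p1, p_single)
```

The width of the interval `p1 - p0` gives a sense of how reliable the probability estimate is: a narrow interval suggests the calibration set pins the probability down well, while a wide one signals that the prediction should be treated with caution.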
Pages: 12
Related papers
50 in total
  • [1] A Survey of Deep Anomaly Detection for System Logs
    Zhao, Xiaoqing
    Jiang, Zhongyuan
    Ma, Jianfeng
    [J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [2] System anomaly detection: Mining firewall logs
    Winding, Robert
    Wright, Timothy
    Chapple, Michael
    [J]. 2006 SECURECOMM AND WORKSHOPS, 2006, : 389 - +
  • [3] DeepEAD: Explainable Anomaly Detection from System Logs
    Wang, Xinda
    Kim, Kyeong Jin
    Wang, Ye
    Koike-Akino, Toshiaki
    Parsons, Kieran
    [J]. ICC 2023-IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2023, : 771 - 776
  • [4] AutoLog: Anomaly detection by deep autoencoding of system logs
    Catillo, Marta
    Pecchia, Antonio
    Villano, Umberto
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2022, 191
  • [5] An Integrated Method for Anomaly Detection From Massive System Logs
    Liu, Zhaoli
    Qin, Tao
    Guan, Xiaohong
    Jiang, Hezhi
    Wang, Chenxu
    [J]. IEEE ACCESS, 2018, 6 : 30602 - 30611
  • [6] Latent Variable Based Anomaly Detection in Network System Logs
    Otomo, Kazuki
    Kobayashi, Satoru
    Fukuda, Kensuke
    Esaki, Hiroshi
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2019, E102D (09) : 1644 - 1652
  • [7] ConAnomaly: Content-Based Anomaly Detection for System Logs
    Lv, Dan
    Luktarhan, Nurbol
    Chen, Yiyong
    [J]. SENSORS, 2021, 21 (18)
  • [8] Anomaly Detection on System Generated Logs-A Survey Study
    Jose, Jisha M.
    Reeja, S. R.
    [J]. MOBILE COMPUTING AND SUSTAINABLE INFORMATICS, 2022, 68 : 779 - 793
  • [9] Anomaly Detection Using System Logs: A Deep Learning Approach
    Sinha, Rohit
    Sur, Rittika
    Sharma, Ruchi
    Shrivastava, Avinash K.
    [J]. INTERNATIONAL JOURNAL OF INFORMATION SECURITY AND PRIVACY, 2022, 16 (01)
  • [10] Adanomaly: Adaptive Anomaly Detection for System Logs with Adversarial Learning
    Qi, Jiaxing
    Luan, Zhongzhi
    Huang, Shaohan
    Wang, Yukun
    Fung, Carol
    Yang, Hailong
    Qian, Depei
    [J]. PROCEEDINGS OF THE IEEE/IFIP NETWORK OPERATIONS AND MANAGEMENT SYMPOSIUM 2022, 2022,