Incremental Analysis of Large-Scale System Logs for Anomaly Detection

Cited by: 0
Authors
Astekin, Merve [1]
Ozcan, Selim [1]
Sozer, Hasan [2]
Affiliations
[1] TUBITAK BILGEM, Inst Informat Technol, Kocaeli, Turkey
[2] Ozyegin Univ, Dept Comp Sci, Istanbul, Turkey
Keywords
log analysis; distributed systems; parallel processing; anomaly detection; big data; machine learning
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Anomalies during system execution can be detected by automated analysis of the logs that the system generates. However, large-scale systems can generate tens of millions of log lines within days. Centralized implementations of traditional machine learning algorithms do not scale to such data. Therefore, we recently introduced a distributed log analysis framework for anomaly detection. In this paper, we introduce an extension of this framework that detects anomalies earlier by replacing the existing offline analysis with incremental analysis. In the extended version, we periodically process the log data accumulated so far. We conducted controlled experiments on a benchmark dataset to evaluate the effectiveness of this approach. We repeated the experiments with various periods, which determine both the frequency of analysis and the size of the data processed each time. The results showed that our online analysis can significantly reduce the time needed to detect anomalies while maintaining the same accuracy as the offline approach. The only exception, in which accuracy is compromised, occurs rarely: when the analysis is triggered before all the log data associated with a particular session of events has been collected.
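The periodic scheme described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' distributed framework: the function names, the record layout `(timestamp, session_id, event)`, and the detection rule (flag a session that lacks a required event) are all assumptions made for illustration, standing in for whatever machine learning model the framework actually applies.

```python
from collections import Counter, defaultdict


def count_vectors(records):
    """Group (timestamp, session_id, event) records into per-session event counts."""
    sessions = defaultdict(Counter)
    for _, session_id, event in records:
        sessions[session_id][event] += 1
    return sessions


def detect(sessions, required_events):
    """Toy stand-in detector (assumption): flag sessions missing a required event."""
    return {
        session_id
        for session_id, counts in sessions.items()
        if any(counts[event] == 0 for event in required_events)
    }


def incremental_analysis(records, period, required_events):
    """Re-analyze all data accumulated so far at the end of every period.

    Returns a dict mapping each flagged session to the time it was first flagged.
    """
    first_flagged = {}
    if not records:
        return first_flagged
    last_timestamp = max(timestamp for timestamp, _, _ in records)
    now = period
    while True:
        # Only the log lines collected up to the current analysis time are visible.
        seen = [record for record in records if record[0] <= now]
        for session_id in detect(count_vectors(seen), required_events):
            first_flagged.setdefault(session_id, now)
        if now >= last_timestamp:
            break
        now += period
    return first_flagged


# Hypothetical log: session B never closes (a real anomaly);
# session C closes only after the first analysis run at time 5.
records = [
    (1, "A", "open"), (2, "A", "close"),
    (3, "B", "open"),
    (4, "C", "open"), (6, "C", "close"),
]
required = {"open", "close"}

offline_result = detect(count_vectors(records), required)
online_result = incremental_analysis(records, period=5, required_events=required)
```

With this toy data, the offline pass over the complete log flags only session B, while the periodic pass flags B already at time 5 and also flags C at time 5. The flag on C is a false alarm of the kind the abstract calls the exceptional case, because the analysis ran before the session's remaining log line had arrived.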
Pages: 2119-2127
Number of pages: 9