Detecting log anomaly using subword attention encoder and probabilistic feature selection

Cited by: 0
Authors
M. Hariharan
Abhinesh Mishra
Sriram Ravi
Ankita Sharma
Anshul Tanwar
Krishna Sundaresan
Prasanna Ganesan
R. Karthik
Affiliations
[1] Cisco Systems India Pvt Ltd, Center for Cyber Physical Systems
[2] Vellore Institute of Technology
Source
Applied Intelligence | 2023 / Volume 53
Keywords
Deep learning; Self-attention; Naive Bayes; Syslog; Anomaly detection; Encoder-decoder
DOI
Not available
Abstract
A log anomaly is a manifestation of a software system error or security threat. Detecting such unusual behaviour across logs in real time is the driving force behind large-scale autonomous monitoring technology that can rapidly raise alerts on zero-day attacks. Increasingly, AI methods are being used to process voluminous log datasets and reveal patterns of correlated anomalies. In this paper, we propose an enhanced approach to learning semantic-aware embeddings for logs called the Subword Encoder Neural network (SEN). Addressing a key limitation of previous semantic log parsing works, the proposed work learns word vectors at subword-level granularity using an attention encoder strategy. The learnt embeddings reflect the contextual and lexical relationships at the word level. As a result, the learnt word representations precisely capture new log messages not previously seen by the model. Furthermore, we develop a novel feature distillation algorithm termed Naive Bayes Feature Selector (NBFS) to extract useful log events. This probabilistic technique examines the occurrence pattern of events to select only the salient ones that can aid anomaly detection. To the best of our knowledge, this is the first attempt to associate affinity to log events based on the target task. Since the predictions can be traced to the log messages, the AI is also inherently explainable. The model outperforms state-of-the-art methods by a fair margin, achieving a detection F1-score of 0.99 on the benchmark BGL, HDFS and OpenStack log datasets.
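The abstract does not spell out how NBFS scores events, so the following is only an illustrative sketch of the general idea of ranking log events by how differently they occur in normal and anomalous sequences. It is not the authors' algorithm. The function names `score_events` and `select_events`, the smoothing parameter `alpha`, the event IDs, and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter


def score_events(sequences, labels, alpha=1.0):
    """Score each event by the absolute log-ratio of its smoothed presence
    probability in anomalous (label 1) versus normal (label 0) sequences.

    Assumption: this Naive Bayes style score stands in for whatever
    criterion the paper's NBFS uses, which the abstract does not specify.
    """
    n_anom = sum(1 for y in labels if y == 1)
    n_norm = len(labels) - n_anom
    anom_counts, norm_counts = Counter(), Counter()
    for seq, y in zip(sequences, labels):
        # Count presence per sequence, not raw frequency.
        (anom_counts if y == 1 else norm_counts).update(set(seq))
    scores = {}
    for event in set(anom_counts) | set(norm_counts):
        # Laplace smoothing over the two outcomes (present / absent).
        p_anom = (anom_counts[event] + alpha) / (n_anom + 2 * alpha)
        p_norm = (norm_counts[event] + alpha) / (n_norm + 2 * alpha)
        scores[event] = abs(math.log(p_anom / p_norm))
    return scores


def select_events(scores, k):
    """Return the k highest-scoring events, ties broken by event name."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [event for event, _ in ranked[:k]]


# Hypothetical toy data: four log sequences of event IDs, two labelled anomalous.
sequences = [
    ["E1", "E2", "E3"],
    ["E1", "E2"],
    ["E1", "E4", "E2"],
    ["E1", "E4"],
]
labels = [0, 0, 1, 1]

scores = score_events(sequences, labels)
selected = select_events(scores, k=2)
print(selected)
```

In this toy data, `E1` appears in every sequence and therefore carries no signal, while `E4` appears only in anomalous sequences and ranks highest. A real pipeline would pass only the selected events to the downstream detector.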
Pages: 22297 - 22312
Page count: 15
Related papers
50 in total
  • [11] Unsupervised Anomaly Detection Using Variational Auto-Encoder based Feature Extraction
    Yao, Rong
    Liu, Chongdang
    Zhang, Linxuan
    Peng, Peng
    2019 IEEE INTERNATIONAL CONFERENCE ON PROGNOSTICS AND HEALTH MANAGEMENT (ICPHM), 2019,
  • [12] Feature Selection for Anomaly Detection Using Optical Emission Spectroscopy
    Puggini, Luca
    McLoone, Sean
    IFAC PAPERSONLINE, 2016, 49 (05): : 132 - 137
  • [13] Improving feature selection in anomaly intrusion detection using specifications
    Wang, Y
    Miner, A
    Wong, J
    Uppuluri, P
    DISTRIBUTED COMPUTING AND INTERNET TECHNOLOGY, PROCEEDINGS, 2004, 3347 : 468 - 468
  • [14] Unsupervised probabilistic feature selection using ant colony optimization
    Dadaneh, Behrouz Zamani
    Markid, Hossein Yeganeh
    Zakerolhosseini, Ali
    EXPERT SYSTEMS WITH APPLICATIONS, 2016, 53 : 27 - 42
  • [15] Feature Selection Using Probabilistic Prediction of Support Vector Regression
    Yang, Jian-Bo
    Ong, Chong-Jin
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 2011, 22 (06): : 954 - 962
  • [16] Multiple feature set with feature selection for anomaly search in videos using hybrid classification
    A. Srinivasan
    V. K. Gnanavel
    Multimedia Tools and Applications, 2019, 78 : 7713 - 7725
  • [17] Optimal interval and feature selection in activity data for detecting attention deficit hyperactivity disorder
    Shafna, V.
    S.D., Madhu Kumar
    Computers in Biology and Medicine, 2024, 179
  • [18] Multiple feature set with feature selection for anomaly search in videos using hybrid classification
    Srinivasan, A.
    Gnanavel, V. K.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (06) : 7713 - 7725
  • [19] Nonlinear feature selection using sparsity-promoted centroid-encoder
    Ghosh, Tomojit
    Kirby, Michael
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (29): : 21883 - 21902
  • [20] Nonlinear feature selection using sparsity-promoted centroid-encoder
    Tomojit Ghosh
    Michael Kirby
    Neural Computing and Applications, 2023, 35 : 21883 - 21902