Detecting log anomaly using subword attention encoder and probabilistic feature selection

Citations: 0
Authors
Hariharan, M. [1]
Mishra, Abhinesh [1]
Ravi, Sriram [1]
Sharma, Ankita [1]
Tanwar, Anshul [1]
Sundaresan, Krishna [1]
Ganesan, Prasanna [1]
Karthik, R. [2]
Affiliations
[1] Cisco Syst India Pvt Ltd, Bengaluru, India
[2] Vellore Inst Technol, Ctr Cyber Phys Syst, Chennai, India
Keywords
Deep learning; Self-attention; Naive Bayes; Syslog; Anomaly detection; Encoder-decoder;
DOI
10.1007/s10489-023-04674-6
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A log anomaly is a manifestation of a software system error or a security threat. Detecting such unusual behaviour across logs in real time is the driving force behind large-scale autonomous monitoring technology that can rapidly raise alerts on zero-day attacks. Increasingly, AI methods are being used to process voluminous log datasets and reveal patterns of correlated anomalies. In this paper, we propose an enhanced approach to learning semantic-aware embeddings for logs, called the Subword Encoder Neural network (SEN). Addressing a key limitation of previous semantic log parsing work, the proposed method learns word vectors at subword-level granularity using an attention encoder strategy. The learnt embeddings reflect the contextual and lexical relationships at the word level. As a result, the learnt word representations precisely capture new log messages not previously seen by the model. Furthermore, we develop a novel feature distillation algorithm, termed the Naive Bayes Feature Selector (NBFS), to extract useful log events. This probabilistic technique examines the occurrence pattern of events and selects only the salient ones that can aid anomaly detection. To the best of our knowledge, this is the first attempt to associate an affinity with log events based on the target task. Since the predictions can be traced back to the log messages, the model is also inherently explainable. The model outperforms state-of-the-art methods by a fair margin, achieving a detection F1-score of 0.99 on the BGL, HDFS and OpenStack benchmark log datasets.
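The abstract does not spell out how NBFS scores events, so the following is only a minimal sketch of the general idea of selecting log events by their occurrence pattern. It assumes each log sequence is a list of event IDs with a binary anomaly label, and it ranks events by a Laplace-smoothed, Naive-Bayes-style log-likelihood ratio. The function names, the smoothing constant `alpha`, the `threshold`, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def event_log_odds(sequences, labels, alpha=1.0):
    """Score each event by log P(event present | anomalous) / P(event present | normal).

    Assumption: presence/absence per sequence (Bernoulli model) with Laplace
    smoothing. This is an illustrative stand-in, not the paper's NBFS.
    """
    n_anom = sum(1 for y in labels if y == 1)
    n_norm = len(labels) - n_anom
    anom_counts, norm_counts = Counter(), Counter()
    for events, y in zip(sequences, labels):
        target = anom_counts if y == 1 else norm_counts
        target.update(set(events))  # count presence once per sequence
    scores = {}
    for event in set(anom_counts) | set(norm_counts):
        p_anom = (anom_counts[event] + alpha) / (n_anom + 2 * alpha)
        p_norm = (norm_counts[event] + alpha) / (n_norm + 2 * alpha)
        scores[event] = math.log(p_anom / p_norm)
    return scores


def select_salient_events(scores, threshold=0.5):
    """Keep events whose absolute log-ratio meets a (hypothetical) threshold."""
    return sorted(e for e, s in scores.items() if abs(s) >= threshold)


# Toy data: two normal sequences (label 0) and two anomalous ones (label 1).
sequences = [["E1", "E2", "E3"], ["E1", "E2"], ["E1", "E4", "E2"], ["E1", "E4"]]
labels = [0, 0, 1, 1]

scores = event_log_odds(sequences, labels)
salient = select_salient_events(scores, threshold=0.5)
print(salient)
```

In this toy example, an event that appears equally often in both classes (`E1`) scores zero and is dropped, while events concentrated in one class are kept. Because each retained event maps back to concrete log messages, a selection step of this kind is one plausible reason predictions remain traceable.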
Pages: 22297 - 22312
Page count: 16
Related papers
50 items in total
  • [41] Speech emotion classification using attention based network and regularized feature selection
    Samson Akinpelu
    Serestina Viriri
    Scientific Reports, 13
  • [42] Speech emotion classification using attention based network and regularized feature selection
    Akinpelu, Samson
    Viriri, Serestina
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [43] Identification and Analysis of Cancer Diagnosis Using Probabilistic Classification Vector Machines with Feature Selection
    Du, Xiuquan
    Li, Xinrui
    Li, Wen
    Yan, Yuanting
    Zhang, Yanping
    CURRENT BIOINFORMATICS, 2018, 13 (06) : 625 - 632
  • [44] Deep fake detection using cascaded deep sparse auto-encoder for effective feature selection
    Balasubramanian, Saravana Balaji
    Kannan, Jagadeesh R.
    Prabu, P.
    Venkatachalam, K.
    Trojovsky, Pavel
    PEERJ COMPUTER SCIENCE, 2022, 8
  • [45] River flood prediction through flow level modeling using multi-attention encoder-decoder-based TCN with filter-wrapper feature selection
    Jeba, G. Selva
    Chitra, P.
    EARTH SCIENCE INFORMATICS, 2024, 17 (06) : 5233 - 5249
  • [46] Satellite Telemetry Data Anomaly Detection Using Causal Network and Feature-Attention-Based LSTM
    Zeng, Zefan
    Jin, Guang
    Xu, Chi
    Chen, Siya
    Zeng, Zhelong
    Zhang, Lu
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2022, 71
  • [47] Anomaly Detection of Internet Traffic using Robust Feature Selection based on Kernel Density Estimation
    Leal, Sara Faria
    Rosario Oliveira, M.
    Valadas, Rui
    2015 EUROPEAN CONFERENCE ON NETWORKS AND COMMUNICATIONS (EUCNC), 2015, : 482 - 486
  • [48] Detecting, localizing and classifying visual traits from arbitrary viewpoints using probabilistic local feature modeling
    Toews, Matthew
    Arbel, Tal
    ANALYSIS AND MODELING OF FACES AND GESTURES, PROCEEDINGS, 2007, 4778 : 154 - 167
  • [49] GNSS jamming detection using attention-based mutual information feature selection
    Ali Reda
    Tamer Mekkawy
    Discover Applied Sciences, 6
  • [50] GNSS jamming detection using attention-based mutual information feature selection
    Reda, Ali
    Mekkawy, Tamer
    DISCOVER APPLIED SCIENCES, 2024, 6 (04)