Detecting log anomaly using subword attention encoder and probabilistic feature selection

Cited by: 0
Authors
Hariharan, M. [1]
Mishra, Abhinesh [1]
Ravi, Sriram [1]
Sharma, Ankita [1]
Tanwar, Anshul [1]
Sundaresan, Krishna [1]
Ganesan, Prasanna [1]
Karthik, R. [2]
Affiliations
[1] Cisco Syst India Pvt Ltd, Bengaluru, India
[2] Vellore Inst Technol, Ctr Cyber Phys Syst, Chennai, India
Keywords
Deep learning; Self-attention; Naive Bayes; Syslog; Anomaly detection; Encoder-decoder
DOI
10.1007/s10489-023-04674-6
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A log anomaly is a manifestation of a software system error or security threat. Detecting such unusual behaviours across logs in real time is the driving force behind large-scale autonomous monitoring technology that can rapidly raise alerts on zero-day attacks. Increasingly, AI methods are being used to process voluminous log datasets and reveal patterns of correlated anomalies. In this paper, we propose an enhanced approach to learning semantic-aware embeddings for logs called the Subword Encoder Neural network (SEN). Addressing a key limitation of previous semantic log parsing works, the proposed work learns word vectors at subword-level granularity using an attention encoder strategy. The learnt embeddings reflect the contextual/lexical relationships at the word level. As a result, the learnt word representations precisely capture new log messages not previously seen by the model. Furthermore, we develop a novel feature distillation algorithm termed Naive Bayes Feature Selector (NBFS) to extract useful log events. This probabilistic technique examines the occurrence pattern of events to select only the salient ones that can aid anomaly detection. To the best of our knowledge, this is the first attempt to associate affinity to log events based on the target task. Since the predictions can be traced to the log messages, the model is also inherently explainable. The model outperforms state-of-the-art methods by a fair margin. It achieves a 0.99 detection F1-score on the benchmark BGL, HDFS and OpenStack log datasets.
Pages: 22297-22312
Page count: 16
Related papers
  • [31] Feature Selection Mechanism for Attention Classification using Gaze Tracking Data
    Khan, Ahsan Raza
    Bokhari, Syed Mohsin
    Khosravi, Sara
    Hussain, Sajjad
    Ghannam, Rami
    Imran, Muhammad Ali
    Zoha, Ahmed
    2022 29TH IEEE INTERNATIONAL CONFERENCE ON ELECTRONICS, CIRCUITS AND SYSTEMS (IEEE ICECS 2022), 2022,
  • [32] An Ensemble Framework of Anomaly Detection using Hybridized Feature Selection Approach (HFSA)
    Haq, Nutan Farah
    Onik, Abdur Rahman
    Shah, Faisal Muhammad
    2015 SAI INTELLIGENT SYSTEMS CONFERENCE (INTELLISYS), 2015, : 989 - 995
  • [33] Machine Learning Based Anomaly Detection of Log Files Using Ensemble Learning and Self-Attention
    Falt, Markus
    Forsstrom, Stefan
    Zhang, Tingting
    2021 5TH INTERNATIONAL CONFERENCE ON SYSTEM RELIABILITY AND SAFETY (ICSRS 2021), 2021, : 209 - 215
  • [34] Video Anomaly Detection Using Encoder-Decoder Networks with Video Vision Transformer and Channel Attention Blocks
    Kobayashi, Shimpei
    Hizukuri, Akiyoshi
    Nakayama, Ryohei
    2023 18TH INTERNATIONAL CONFERENCE ON MACHINE VISION AND APPLICATIONS, MVA, 2023,
  • [35] DAM: Hierarchical Adaptive Feature Selection Using Convolution Encoder Decoder Network for Strawberry Segmentation
    Ilyas, Talha
    Umraiz, Muhammad
    Khan, Abbas
    Kim, Hyongsuk
    FRONTIERS IN PLANT SCIENCE, 2021, 12
  • [36] Mutation mayfly algorithm (MMA) based feature selection and probabilistic anomaly detection model for cyber-physical systems
    Vignesh, C. Babu
    Arul, E.
    Mahavishnu, V. C.
    Punidha, A.
    INTERNATIONAL JOURNAL OF SYSTEM ASSURANCE ENGINEERING AND MANAGEMENT, 2024, 15 (12) : 5454 - 5468
  • [37] Anomaly Intrusion Detection Using Incremental Learning of an Infinite Mixture Model with Feature Selection
    Fan, Wentao
    Bouguila, Nizar
    Sallay, Hassen
    ROUGH SETS AND KNOWLEDGE TECHNOLOGY: 8TH INTERNATIONAL CONFERENCE, 2013, 8171 : 364 - 373
  • [38] A GAN-based anomaly detector using multi-feature fusion and selection
    Dai, Huafeng
    Wang, Jyunrong
    Zhong, Quan
    Chen, Taogen
    Liu, Hao
    Zhang, Xuegang
    Lu, Rongsheng
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [39] An Advanced Fitness Function Optimization Algorithm for Anomaly Intrusion Detection Using Feature Selection
    Hong, Sung-Sam
    Lee, Eun-joo
    Kim, Hwayoung
    APPLIED SCIENCES-BASEL, 2023, 13 (08):
  • [40] Feature and Weight Selection Using Tabu Search for Improving the Recognition Rate of Duct Anomaly
    Wang Yongxiong
    Kai Li
    2014 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS IEEE-ROBIO 2014, 2014, : 2163 - 2168