Detecting multi-stage attacks using sequence-to-sequence model

Cited by: 13
Authors
Zhou, Peng [1]
Zhou, Gongyan [1]
Wu, Dakui [1]
Fei, Minrui [1]
Affiliations
[1] Shanghai University, Shanghai Key Laboratory of Power Station Automation Technology, Shanghai, People's Republic of China
Funding
Natural Science Foundation of Shanghai; National Natural Science Foundation of China
Keywords
Multi-stage attack; Intrusion detection; Sequence-to-sequence model; Encoder-decoder architecture; Long-short term memory (LSTM) network;
DOI
10.1016/j.cose.2021.102203
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A multi-stage attack is a sophisticated intrusion strategy that has been widely used to penetrate well-protected network infrastructures. To detect such attacks, state-of-the-art research advocates the use of hidden Markov models (HMMs). However, although HMMs can model the relationships and dependencies among different alerts and stages for detection, they cannot handle well the stage dependencies buried in longer sequences of alerts. In this paper, we tackle the challenge of the stages' long-term dependency and propose a new detection solution using a sequence-to-sequence (seq2seq) model. The basic idea is to encode a sequence of alerts (i.e., the detector's observations) into a latent feature vector using a long short-term memory (LSTM) network and then decode this vector into a sequence of predicted attack stages with another LSTM. Through this encoder-decoder collaboration, we decouple the local constraint between the observed alerts and the potential attack stages, and are thus able to use full knowledge of all the alerts to detect stages on a sequence basis. With the LSTM, we can learn to "forget" irrelevant alerts and thereby have more opportunity to "remember" the long-term dependencies between different stages for sequence detection. To evaluate our model's effectiveness, we conducted extensive experiments using four public datasets, all of which include simulated or reconstructed samples of real-world multi-stage attacks in controlled testbeds. Our results confirm that our model achieves better detection performance than previous HMM solutions. (c) 2021 Elsevier Ltd. All rights reserved.
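The encode-then-decode idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the weights are random and untrained, the decoder is assumed to emit one stage per alert, decoding is greedy, and all names (`LSTMCell`, `predict_stages`, `n_alert_types`, `n_stages`) are hypothetical. It only shows how one LSTM compresses the whole alert sequence into a state and a second LSTM unrolls that state into a stage sequence.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def one_hot(index, size):
    return [1.0 if k == index else 0.0 for k in range(size)]


class LSTMCell:
    """One LSTM cell with random, untrained weights (illustration only)."""

    def __init__(self, input_size, hidden_size, rng):
        n = input_size + hidden_size
        # One weight matrix and bias per gate:
        # i = input, f = forget, g = candidate memory, o = output.
        self.W = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                   for _ in range(hidden_size)]
            for gate in "ifgo"
        }
        self.b = {gate: [0.0] * hidden_size for gate in "ifgo"}

    def _gate(self, gate, z, activation):
        return [activation(sum(w * v for w, v in zip(row, z)) + bias)
                for row, bias in zip(self.W[gate], self.b[gate])]

    def step(self, x, h, c):
        z = x + h  # concatenate the input with the previous hidden state
        i = self._gate("i", z, sigmoid)
        f = self._gate("f", z, sigmoid)  # how much old memory to keep
        g = self._gate("g", z, math.tanh)
        o = self._gate("o", z, sigmoid)
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


def predict_stages(alerts, n_alert_types, n_stages, hidden_size=8, seed=0):
    """Encode integer alert IDs, then greedily decode one stage ID per alert."""
    rng = random.Random(seed)
    encoder = LSTMCell(n_alert_types, hidden_size, rng)
    decoder = LSTMCell(n_stages + 1, hidden_size, rng)  # +1 for a start token
    w_out = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
             for _ in range(n_stages)]

    # Encoder: read every alert; the final (h, c) is the latent feature vector.
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    for alert in alerts:
        h, c = encoder.step(one_hot(alert, n_alert_types), h, c)

    # Decoder: start from the encoder state and feed back each predicted stage.
    prev = n_stages  # index of the start token
    stages = []
    for _ in alerts:
        h, c = decoder.step(one_hot(prev, n_stages + 1), h, c)
        scores = [sum(w * v for w, v in zip(row, h)) for row in w_out]
        prev = max(range(n_stages), key=scores.__getitem__)
        stages.append(prev)
    return stages


# Five alerts drawn from 4 hypothetical alert types, mapped to 3 stages.
print(predict_stages([0, 3, 1, 1, 2], n_alert_types=4, n_stages=3))
```

Because the decoder only starts after the encoder has consumed every alert, each predicted stage can depend on the full alert sequence rather than on the alert at the same position. With untrained weights the printed stages are arbitrary.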
Pages: 15