Learning from Time Series with Outlier Correction for Malicious Domain Identification

Cited by: 0
Authors
Tan, Guolin [1, 2]
Zhang, Peng [1]
Zhang, Lei [1, 2]
Zhang, Yu [1, 2]
Zhang, Chuang [1]
Liu, Qingyun [1]
Liu, Xinran [3]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
[3] Natl Comp Network Emergency Response & Coordinat, Beijing, Peoples R China
DOI
10.1109/ISSREW.2019.00040
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Malicious domain identification is an important task in cyberspace security. However, most existing work on this task relies heavily on expert experience when constructing machine learning features. Worse, attackers can deliberately change these features, so such identification methods are easily bypassed by cyber criminals. To solve this problem, we propose a novel method for malicious domain identification that learns time series shapelets, the discriminative local patterns of time series. Our method has two main components: 1) Modeling users' habits of accessing domains by learning shapelets from domain time series. Because a domain's time series is generated by the crowd visiting the website, the learned access habits can potentially reflect what type of service the domain provides, such as pornography or gambling. 2) An outlier correction algorithm that operates on a single time series, is independent of the model, and can make shapelet initialization more robust. We integrate shapelet learning and outlier correction in our model. Extensive experiments on a real-world dataset demonstrate that the proposed method outperforms state-of-the-art methods.
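The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the distance measure or the correction rule, so the sketch assumes the common sliding-window mean squared distance between a shapelet and a series, and substitutes a simple median/MAD rule (modified z-score) as a stand-in for the outlier correction step. The function names `shapelet_distance` and `correct_outliers` and the `threshold` parameter are hypothetical.

```python
import numpy as np


def shapelet_distance(series, shapelet):
    """Smallest mean squared distance between the shapelet and any
    same-length window of the series (an assumed, commonly used measure)."""
    series = np.asarray(series, dtype=float)
    shapelet = np.asarray(shapelet, dtype=float)
    # All contiguous windows of the series with the shapelet's length.
    windows = np.lib.stride_tricks.sliding_window_view(series, len(shapelet))
    return float(np.min(np.mean((windows - shapelet) ** 2, axis=1)))


def correct_outliers(series, threshold=3.5):
    """Hypothetical stand-in for outlier correction on a single series:
    points whose modified z-score exceeds `threshold` are replaced by
    the series median. This is NOT the algorithm from the paper."""
    series = np.asarray(series, dtype=float)
    median = np.median(series)
    mad = np.median(np.abs(series - median))
    if mad == 0:
        # No spread to measure against; return the series unchanged.
        return series.copy()
    z = 0.6745 * (series - median) / mad
    return np.where(np.abs(z) > threshold, median, series)
```

Under these assumptions, a domain's access counts would first pass through `correct_outliers`, so that a single traffic spike does not dominate the candidate shapelets, and `shapelet_distance` would then yield one feature per shapelet for a downstream classifier.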
Pages: 42-46
Page count: 5