Learning from Time Series with Outlier Correction for Malicious Domain Identification

Cited by: 0
Authors
Tan, Guolin [1, 2]
Zhang, Peng [1]
Zhang, Lei [1, 2]
Zhang, Yu [1, 2]
Zhang, Chuang [1]
Liu, Qingyun [1]
Liu, Xinran [3]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
[3] Natl Comp Network Emergency Response & Coordinat, Beijing, Peoples R China
DOI
10.1109/ISSREW.2019.00040
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Malicious domain identification is an important task in cyberspace security. However, most existing work on this task relies heavily on expert experience to construct machine learning features. Worse, attackers can deliberately alter these features, so such identification methods are easily bypassed by cyber criminals. To address this problem, we propose a novel method for malicious domain identification that learns time series shapelets, the discriminative local patterns of a time series. Our method has two main components: 1) a model of users' domain-access habits, obtained by learning shapelets from domain time series. Because a domain time series is generated by the crowd visiting websites, the learned access habits can potentially reflect the type of service a domain provides, such as pornography or gambling. 2) An outlier correction algorithm that operates on a single time series, is independent of the model, and improves the robustness of shapelet initialization. We integrate shapelet learning and outlier correction in our model. Extensive experiments on a real-world dataset demonstrate that the proposed method outperforms state-of-the-art methods.
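The abstract names two ideas without giving their details: matching a shapelet against a time series, and correcting outliers in a single series before shapelets are initialized. The sketch below illustrates the general concepts only and is not the authors' algorithm. The function names, the mean-squared sliding-window distance, and the median/MAD replacement rule with threshold `k` are all assumptions chosen for illustration.

```python
from statistics import median


def shapelet_distance(series, shapelet):
    """Smallest mean squared distance between the shapelet and any
    same-length window of the series (a common shapelet distance;
    assumed here, not taken from the paper)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = float("inf")
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = sum((w - s) ** 2 for w, s in zip(window, shapelet)) / m
        best = min(best, dist)
    return best


def correct_outliers(series, k=3.0):
    """Replace points farther than k scaled MADs from the median with
    the median. A hypothetical single-series rule, standing in for the
    paper's unspecified outlier correction algorithm."""
    med = median(series)
    mad = median(abs(x - med) for x in series)
    if mad == 0:
        # No spread to measure against; leave the series unchanged.
        return list(series)
    limit = k * 1.4826 * mad  # 1.4826 scales MAD to a std-dev estimate
    return [med if abs(x - med) > limit else x for x in series]


# Hypothetical hourly visit counts for one domain, with one spike.
visits = [5, 6, 5, 7, 500, 6, 5, 6, 7, 6, 5, 6]
cleaned = correct_outliers(visits)
feature = shapelet_distance(cleaned, [5, 6, 7])
```

In a shapelet-based classifier, distances such as `feature` (one per learned shapelet) typically serve as the input features. Correcting the spike first keeps a single extreme value from dominating whichever windows are used to initialize the shapelets.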
Pages: 42-46
Page count: 5