SHARKFIN: Spatio-temporal mining of software adoption and penetration

Cited by: 1
Authors
Papalexakis, Evangelos E. [1]
Dumitras, Tudor [2]
Chau, Duen Horng [3]
Prakash, B. Aditya [4]
Faloutsos, Christos [1]
Affiliations
[1] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
[2] Univ Maryland, Dept ECE, College Pk, MD 20742 USA
[3] Georgia Tech, Sch Computat Sci & Engn, Atlanta, GA USA
[4] Virginia Tech, Dept Comp Sci, Blacksburg, VA 24061 USA
Keywords
Malware propagation; Internet security; Data analysis
DOI
10.1007/s13278-014-0240-2
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
How does malware propagate? Does it form spikes over time? Does it resemble the propagation pattern of benign files, such as software patches? Does it spread uniformly across countries? How long does it take for a URL that distributes malware to be detected and shut down? In this work, we answer these questions by analyzing patterns from 22 million malicious (and benign) files, found on 1.6 million hosts worldwide during the month of June 2011. We conduct this study using the WINE database available at Symantec Research Labs. Additionally, we explore the research questions raised by sampling such large databases of executables. Studying the implications of sampling matters for two reasons: first, sampling reduces the size of the database, making it more accessible to researchers; second, every such data collection can itself be viewed as a sample of the real world. We discover the SHARKFIN temporal propagation pattern of executable files, the GEOSPLIT pattern in the geographical spread of machines that report executables to Symantec's servers, and the Periodic Power Law (PPL) distribution of the lifetime of URLs, and we show how to efficiently extrapolate crucial properties of the data from a small sample. We further investigate the propagation patterns of benign and malicious executables, unveiling latent structures in the way these files spread. To the best of our knowledge, our work represents the largest study of propagation patterns of executables.
Pages: 1 / 15
Number of pages: 15