Hierarchical Management of Large-Scale Malware Data

Authors
Kellogg, Lee [1]
Ruttenberg, Brian [1]
O'Connor, Alison [1]
Howard, Michael [1]
Pfeffer, Avi [1]
Affiliation
[1] Charles River Analyt, 625 Mt Auburn St, Cambridge, MA 02138 USA
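Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract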
As the pace of generation of new malware accelerates, clustering and classifying newly discovered malware requires new approaches to data management. We describe our Big Data approach to managing malware to support effective and efficient malware analysis on large and rapidly evolving sets of malware. The key element of our approach is a hierarchical organization of the malware, which organizes malware into families, maintains a rich description of the relationships between malware, and facilitates efficient online analysis of new malware as they are discovered. Using clustering evaluation metrics, we show that our system discovers malware families comparable to those produced by traditional hierarchical clustering algorithms, while scaling much better with the size of the data set. We also show the flexibility of our system as it relates to substituting various data representations, methods of comparing malware binaries, clustering algorithms, and other factors. Our approach will enable malware analysts and investigators to quickly understand and quantify changes in the global malware ecosystem.
Pages: 666-674
Number of pages: 9
Related papers
50 in total
  • [41] The Circle Of Life: A Large-Scale Study of The IoT Malware Lifecycle
    Alrawi, Omar
    Lever, Charles
    Valakuzhy, Kevin
    Court, Ryan
    Snow, Kevin
    Monrose, Fabian
    Antonakakis, Manos
    PROCEEDINGS OF THE 30TH USENIX SECURITY SYMPOSIUM, 2021, : 3505 - 3522
  • [42] Hierarchical Quality Monitoring for Large-Scale Industrial Plants With Big Process Data
    Yao, Le
    Shao, Weiming
    Ge, Zhiqiang
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (08) : 3330 - 3341
  • [43] Hierarchical infrastructure for large-scale distributed privacy-preserving data mining
    Wang, JL
    Xu, CF
    Shen, HF
    Pan, YH
    COMPUTATIONAL SCIENCE - ICCS 2005, PT 3, 2005, 3516 : 1020 - 1023
  • [44] Data Management for Large-Scale Position-Tracking Systems
    Inoue, Fumiaki
    Zhang, Yongbing
    Ji, Yusheng
    IEICE TRANSACTIONS ON COMMUNICATIONS, 2011, E94B (01) : 45 - 54
  • [45] Guest Editorial: Large-scale Data Management for Mobile Applications
    Thierry Delot
    Sandra Geisler
    Sergio Ilarri
    Christoph Quix
    Distributed and Parallel Databases, 2016, 34 : 1 - 2
  • [46] BASIC: an Alternative to BASE for Large-Scale Data Management System
    Wu, Lengdong
    Yuan, Li-Yan
    You, Jia-Huai
    2014 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2014, : 5 - 14
  • [47] Efficient data management in a large-scale epidemiology research project
    Meyer, Jens
    Ostrzinski, Stefan
    Fredrich, Daniel
    Havemann, Christoph
    Krafczyk, Janina
    Hoffmann, Wolfgang
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2012, 107 (03) : 425 - 435
  • [48] SNPP: automating large-scale SNP genotype data management
    Zhao, LJ
    Li, MX
    Guo, YF
    Xu, FH
    Li, JL
    Deng, HW
    BIOINFORMATICS, 2005, 21 (02) : 266 - 268
  • [49] Guest Editorial: Large-scale Data Management for Mobile Applications
    Delot, Thierry
    Geisler, Sandra
    Ilarri, Sergio
    Quix, Christoph
    DISTRIBUTED AND PARALLEL DATABASES, 2016, 34 (01) : 1 - 2
  • [50] Data and animal management software for large-scale phenotype screening
    Keith A. Ching
    Michael P. Cooke
    Lisa M. Tarantino
    Hilmar Lapp
    Mammalian Genome, 2006, 17 : 288 - 297