Hierarchical Management of Large-Scale Malware Data

Authors
Kellogg, Lee [1]
Ruttenberg, Brian [1]
O'Connor, Alison [1]
Howard, Michael [1]
Pfeffer, Avi [1]
Affiliation
[1] Charles River Analyt, 625 Mt Auburn St, Cambridge, MA 02138 USA
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
As the pace of generation of new malware accelerates, clustering and classifying newly discovered malware requires new approaches to data management. We describe our Big Data approach to managing malware to support effective and efficient malware analysis on large and rapidly evolving sets of malware. The key element of our approach is a hierarchical organization of the malware, which organizes malware into families, maintains a rich description of the relationships between malware, and facilitates efficient online analysis of new malware as it is discovered. Using clustering evaluation metrics, we show that our system discovers malware families comparable to those produced by traditional hierarchical clustering algorithms, while scaling much better with the size of the data set. We also show the flexibility of our system as it relates to substituting various data representations, methods of comparing malware binaries, clustering algorithms, and other factors. Our approach will enable malware analysts and investigators to quickly understand and quantify changes in the global malware ecosystem.
Pages: 666-674
Number of pages: 9