Cross-Regional Malware Detection via Model Distilling and Federated Learning

Cited by: 1
Authors
Botacin, Marcus [1]
Gomes, Heitor [2]
Affiliations
[1] Texas A&M Univ, College Stn, TX 77840 USA
[2] Victoria Univ Wellington, Wellington, New Zealand
Keywords
malware; federated learning; model distilling
DOI
10.1145/3678890.3678893
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Machine Learning (ML) is a key part of modern malware detection pipelines, but its application is not straightforward. It involves multiple practical challenges that the literature frequently leaves unaddressed. A key challenge is the heterogeneity of scenarios. Antivirus (AV) companies, for instance, operate under different performance constraints in the backend and at the endpoint, and with diverse datasets depending on the country they operate in. In this paper, we evaluate the impact of these heterogeneous aspects by developing a classification pipeline for 3 datasets of 10K malware samples each, collected by an AV company in the USA, Brazil, and Japan in the same period. We characterize the different requirements for these datasets and show that a different number of features is required to reach the optimal detection rate in each scenario. We show that a global model combining the three datasets increases the detection rate on each of the three individual datasets. We propose using Federated Learning (FL) to build the global model and a distilling process to generate the local versions. We order the samples temporally to show that although retraining on concept drift detection helps recover the detection rate, only an FL approach can increase the detection rate.
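The abstract's two-step idea (a federated global model, then distilled local versions) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the classifier (logistic regression), the aggregation rule (sample-weighted federated averaging), the synthetic data produced by the hypothetical `make_region` helper, and the choice of which features the distilled student keeps.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def local_train(w, data, lr=0.5, epochs=20):
    """Batch gradient descent on cross-entropy; targets may be hard (0/1) or soft."""
    w = list(w)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wj - lr * gj / len(data) for wj, gj in zip(w, grad)]
    return w

def fed_avg(client_weights, client_sizes):
    """Average client weight vectors, weighted by each client's sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
            for j in range(dim)]

def make_region(rng, n, shift):
    """Hypothetical synthetic region: same labeling rule, shifted feature distribution."""
    data = []
    for _ in range(n):
        # x[0] is a bias term; x[3] is pure noise the student can drop
        x = [1.0, rng.gauss(shift, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        y = 1.0 if x[1] + 0.5 * x[2] > 0 else 0.0
        data.append((x, y))
    return data

def distill(teacher_w, data, keep, epochs=200):
    """Fit a smaller student (features in `keep`) to the teacher's soft predictions."""
    soft = [([x[j] for j in keep], predict(teacher_w, x)) for x, _ in data]
    return local_train([0.0] * len(keep), soft, epochs=epochs)

def accuracy(w, data, keep=None):
    keep = list(keep) if keep is not None else list(range(len(data[0][0])))
    hits = sum((predict(w, [x[j] for j in keep]) >= 0.5) == (y == 1.0)
               for x, y in data)
    return hits / len(data)

rng = random.Random(0)
regions = {name: make_region(rng, 200, shift)
           for name, shift in [("US", -1.0), ("BR", 0.0), ("JP", 1.0)]}

# Federated rounds: each region trains locally, only weights are shared and averaged.
global_w = [0.0] * 4
for _ in range(10):
    updates = [local_train(global_w, d) for d in regions.values()]
    global_w = fed_avg(updates, [len(d) for d in regions.values()])

# Distill one reduced-feature local model per region from the global teacher.
keep = [0, 1, 2]
students = {name: distill(global_w, d, keep) for name, d in regions.items()}

for name, d in regions.items():
    print(name, "global:", accuracy(global_w, d),
          "student:", accuracy(students[name], d, keep))
```

In this sketch the raw samples never leave their region; only weight vectors are exchanged. The student sees the teacher's probabilities rather than the ground-truth labels, which is the sense of "distilling" used here.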
Pages: 97-113
Page count: 17