Machine learning based mobile malware detection using highly imbalanced network traffic

Cited by: 104
Authors
Chen, Zhenxiang [1, 2]
Yan, Qiben [3]
Han, Hongbo [1, 2]
Wang, Shanshan [1, 2]
Peng, Lizhi [1, 2]
Wang, Lin [1, 2]
Yang, Bo [2]
Affiliations
[1] Univ Jinan, Sch Informat Sci & Engn, Jinan 250022, Shandong, Peoples R China
[2] Shandong Prov Key Lab Network Based Intelligent C, Jinan 250022, Shandong, Peoples R China
[3] Univ Nebraska, Lincoln, NE 68588 USA
Funding
US National Science Foundation; National Natural Science Foundation of China
Keywords
Network traffic; Malicious apps; Imbalanced data; Malware detection; Machine learning; CLASSIFICATION; PERFORMANCE; CLASSIFIERS; SMOTE;
DOI
10.1016/j.ins.2017.04.044
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In recent years, the number and variety of malicious mobile apps have increased drastically, especially on the Android platform, which poses severe challenges for malicious app detection. Researchers endeavor to discover the traces of malicious apps using network traffic analysis. In this study, we combine network traffic analysis with machine learning methods to identify malicious network behavior, and eventually to detect malicious apps. However, most network traffic generated by malicious apps is benign, while only a small portion of traffic is malicious, leading to an imbalanced data problem in which the traffic model skews towards modeling the benign traffic. To address this problem, we introduce imbalanced classification methods, including the synthetic minority oversampling technique (SMOTE) + support vector machine (SVM), SVM cost-sensitive (SVMCS), and C4.5 cost-sensitive (C4.5CS) methods. However, when the imbalance rate reaches a certain threshold, the performance of common imbalanced classification algorithms degrades significantly. To avoid performance degradation, we propose to use the imbalanced data gravitation-based classification (IDGC) algorithm to classify imbalanced data. Moreover, we develop a simplex imbalanced data gravitation classification (S-IDGC) model to further reduce the time costs of IDGC without sacrificing the classification performance. In addition, we propose a machine learning based comparative benchmark prototype system, which provides users with substantial autonomy, such as multiple choices of the desired classifiers or traffic features. Using this prototype system, users can compare the detection performance of different classification algorithms on the same data set, as well as the performance of a specific classification algorithm on multiple data sets. (C) 2017 Published by Elsevier Inc.
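To make the oversampling idea named in the abstract concrete, here is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the authors' implementation: the function name `smote`, the parameters `k` and `seed`, and the two-feature toy flows are all assumptions chosen for illustration. Each synthetic sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic minority samples (SMOTE-style interpolation).

    `minority` is a list of equal-length feature vectors (hypothetical
    traffic features here); `k` is the number of nearest neighbours used.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of `base`, excluding itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data (assumed): 3 malicious flows against 9 benign flows.
malicious = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
benign_count = 9
new_flows = smote(malicious, benign_count - len(malicious))
balanced_malicious = malicious + new_flows
```

After oversampling, the minority class has as many samples as the majority class, so a downstream classifier such as an SVM is no longer dominated by benign traffic. Because every synthetic point lies between two real minority samples, it stays inside the region the minority class already occupies.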
Pages: 346-364
Page count: 19
Related papers
  • [1] Intrusion Detection of Imbalanced Network Traffic Based on Machine Learning and Deep Learning
    Liu, Lan
    Wang, Pengcheng
    Lin, Jun
    Liu, Langzhou
    [J]. IEEE ACCESS, 2021, 9 : 7550 - 7563
  • [3] Malware Detection Using Network Traffic Analysis in Android Based Mobile Devices
    Arora, Anshul
    Garg, Shree
    Peddoju, Sateesh K.
    [J]. 2014 EIGHTH INTERNATIONAL CONFERENCE ON NEXT GENERATION MOBILE APPS, SERVICES AND TECHNOLOGIES (NGMAST), 2014, : 66 - 71
  • [4] Real time malware detection in encrypted network traffic using machine learning with time based features
    Singh, Abhay Pratap
    Singh, Mahendra
    [J]. JOURNAL OF DISCRETE MATHEMATICAL SCIENCES & CRYPTOGRAPHY, 2023, 26 (03): : 841 - 850
  • [5] Finding Android Malware Trace From Highly Imbalanced Network Traffic
    Pang, Ying
    Chen, Zhenxiang
    Li, Xiaomei
    Wang, Shanshan
    Zhao, Chuan
    Wang, Lin
    Ji, Ke
    Li, Zicong
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE) AND IEEE/IFIP INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (EUC), VOL 1, 2017, : 588 - 595
  • [6] A Survey on Mobile Malware Detection Methods using Machine Learning
    Kambar, Mina Esmail Zadeh Nojoo
    Esmaeilzadeh, Armin
    Kim, Yoohwan
    Taghva, Kazem
    [J]. 2022 IEEE 12TH ANNUAL COMPUTING AND COMMUNICATION WORKSHOP AND CONFERENCE (CCWC), 2022, : 215 - 221
  • [7] High Accuracy Detection of Mobile Malware Using Machine Learning
    Yerima, Suleiman Y.
    [J]. ELECTRONICS, 2023, 12 (06)
  • [8] A Fast and Effective Detection of Mobile Malware Behavior Using Network Traffic
    Liu, Anran
    Chen, Zhenxiang
    Wang, Shanshan
    Peng, Lizhi
    Zhao, Chuan
    Shi, Yuliang
    [J]. ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2018, PT IV, 2018, 11337 : 109 - 120
  • [9] A mobile malware detection method using behavior features in network traffic
    Wang, Shanshan
    Chen, Zhenxiang
    Yan, Qiben
    Yang, Bo
    Peng, Lizhi
    Jia, Zhongtian
    [J]. JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2019, 133 : 15 - 25
  • [10] Deep and Machine Learning Approaches for Anomaly-Based Intrusion Detection of Imbalanced Network Traffic
    Abdulhammed, Razan
    Faezipour, Miad
    Abuzneid, Abdelshakour
    AbuMallouh, Arafat
    [J]. IEEE SENSORS LETTERS, 2019, 3 (01)