Dynamic Android Malware Category Classification using Semi-Supervised Deep Learning

Cited by: 108
Authors
Mahdavifar, Samaneh [1]
Kadir, Andi Fitriah Abdul [2]
Fatemi, Rasool [1]
Alhadidi, Dima [3]
Ghorbani, Ali A. [1]
Affiliations
[1] Univ New Brunswick, Fac Comp Sci, Canadian Inst Cybersecur CIC, Fredericton, NB, Canada
[2] Int Islamic Univ Malaysia, Kulliyyah Informat & Commun Technol, Kuala Lumpur, Malaysia
[3] Univ Windsor, Sch Comp Sci, Windsor, ON, Canada
Keywords
Malware; Category Classification; Android; Dynamic Analysis; Semi-Supervised Learning; Deep Learning; Dynamic Behavior Profiles;
DOI
10.1109/DASC-PICom-CBDCom-CyberSciTech49142.2020.00094
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Due to the significant threat posed by Android mobile malware, its detection has become increasingly important. Despite academic and industrial efforts, devising a robust and efficient solution for Android malware detection and category classification is still an open problem. Supervised machine learning has been applied to this problem. However, it is far from perfect because it requires a significant amount of malicious and benign code to be identified and labeled beforehand. Since labeled data is expensive and difficult to obtain while unlabeled data is abundant and cheap in this context, we resort to a semi-supervised learning technique for deep neural networks, namely pseudo-label, which we train using a set of labeled and unlabeled instances. We use dynamic analysis to craft dynamic behavior profiles as feature vectors. Furthermore, we develop a new dataset, namely CICMalDroid2020, which includes 17,341 of the most recent samples from five Android app categories: Adware, Banking, SMS, Riskware, and Benign. Our dataset comprises the most complete set of captured static and dynamic features among publicly available datasets. We evaluate our proposed model on CICMalDroid2020 and compare it with Label Propagation (LP), a well-known semi-supervised machine learning technique, and with other common machine learning algorithms. The experimental results show that the model can classify Android apps by malware category with an F1-score of 97.84 percent and a false positive rate of 2.76 percent, results considerably better than those of LP. These results demonstrate the robustness of our model despite the small number of labeled instances.
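The pseudo-labeling idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps the deep neural network for a simple nearest-centroid classifier so that it runs with the Python standard library alone. The function names, the confidence threshold, the number of rounds, and the two-dimensional toy feature vectors are all assumptions made for illustration; only the category names Benign and Adware come from the abstract.

```python
import math
from collections import defaultdict


def fit_centroids(X, y):
    """Return {label: centroid}, the mean feature vector of each class."""
    sums, counts = {}, defaultdict(int)
    for x, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(x)
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict_proba(centroids, x):
    """Softmax over negative Euclidean distances to each class centroid."""
    scores = {c: -math.dist(x, mu) for c, mu in centroids.items()}
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def predict(centroids, x):
    """Return the most probable class for one feature vector."""
    proba = predict_proba(centroids, x)
    return max(proba, key=proba.get)


def pseudo_label_train(X_lab, y_lab, X_unlab, threshold=0.8, rounds=3):
    """Fit on labeled data, then repeatedly refit on labeled data plus
    the unlabeled points whose predicted class is confident enough.
    The threshold and round count are hypothetical choices."""
    centroids = fit_centroids(X_lab, y_lab)
    for _ in range(rounds):
        X_aug, y_aug = list(X_lab), list(y_lab)
        for x in X_unlab:
            proba = predict_proba(centroids, x)
            label = max(proba, key=proba.get)
            if proba[label] >= threshold:
                # Treat the confident prediction as if it were a true label.
                X_aug.append(x)
                y_aug.append(label)
        centroids = fit_centroids(X_aug, y_aug)
    return centroids


# Toy data: one labeled sample per class and four unlabeled samples.
X_lab = [[0.0, 0.0], [10.0, 10.0]]
y_lab = ["Benign", "Adware"]
X_unlab = [[1.0, 1.0], [0.0, 2.0], [9.0, 9.0], [10.0, 8.0]]

model = pseudo_label_train(X_lab, y_lab, X_unlab)
```

In this toy setup the unlabeled points pull each centroid toward the bulk of its cluster, which is the effect pseudo-labeling relies on when labeled instances are scarce.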
Pages: 515-522
Page count: 8
Related papers
50 in total
  • [1] POSTER: Semi-supervised Classification for Dynamic Android Malware Detection
    Chen, Li
    Zhang, Mingwei
    Yang, Chih-yuan
    Sahita, Ravi
    [J]. CCS'17: PROCEEDINGS OF THE 2017 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, 2017: 2479 - 2481
  • [2] Malware Classification Based on Semi-Supervised Learning
    Ding, Yu
    Zhang, XiaoYu
    Li, BinBin
    Xing, Jian
    Qiang, Qian
    Qi, ZiSen
    Guo, MengHan
    Jia, SiYu
    Wang, HaiPing
    [J]. SCIENCE OF CYBER SECURITY, SCISEC 2022, 2022, 13580 : 287 - 301
  • [3] An adaptive semi-supervised deep learning-based framework for the detection of Android malware
    Wajahat, Ahsan
    He, Jingsha
    Zhu, Nafei
    Mahmood, Tariq
    Nazir, Ahsan
    Pathan, Muhammad Salman
    Qureshi, Sirajuddin
    Ullah, Faheem
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 45 (03) : 5141 - 5157
  • [4] A Novel Malware Traffic Classification Method using Semi-Supervised Learning
    Ning, Jinhui
    Wang, Yu
    Yang, Jie
    Gacanin, Haris
    Ci, Song
    [J]. 2021 IEEE 94TH VEHICULAR TECHNOLOGY CONFERENCE (VTC2021-FALL), 2021,
  • [5] Malware classification for the cloud via semi-supervised transfer learning
    Gao, Xianwei
    Hu, Changzhen
    Shan, Chun
    Liu, Baoxu
    Niu, Zequn
    Xie, Hui
    [J]. JOURNAL OF INFORMATION SECURITY AND APPLICATIONS, 2020, 55
  • [6] Deep graph learning for semi-supervised classification
    Lin, Guangfeng
    Kang, Xiaobing
    Liao, Kaiyang
    Zhao, Fan
    Chen, Yajun
    [J]. PATTERN RECOGNITION, 2021, 118
  • [7] Improving Colonoscopy Lesion Classification Using Semi-Supervised Deep Learning
    Golhar, Mayank
    Bobrow, Taylor L.
    Khoshknab, Mirmilad Pourmousavi
    Jit, Simran
    Ngamruengphong, Saowanee
    Durr, Nicholas J.
    [J]. IEEE ACCESS, 2021, 9 : 631 - 640
  • [8] SEMI-SUPERVISED DEEP LEARNING FOR OBJECT TRACKING AND CLASSIFICATION
    Doulamis, Nikolaos
    Doulamis, Anastasios
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2014: 848 - 852
  • [9] Semi-supervised deep learning for hyperspectral image classification
    Kang, Xudong
    Zhuo, Binbin
    Duan, Puhong
    [J]. REMOTE SENSING LETTERS, 2019, 10 (04) : 353 - 362
  • [10] Deep semi-supervised learning for brain tumor classification
    Chenjie Ge
    Irene Yu-Hua Gu
    Asgeir Store Jakola
    Jie Yang
    [J]. BMC Medical Imaging, 20