Predicting the Impact of Android Malicious Samples via Machine Learning

Cited by: 12
Authors
Qiu, Junyang [1]
Luo, Wei [1]
Pan, Lei [1]
Tai, Yonghang [2]
Zhang, Jun [3]
Xiang, Yang [3]
Affiliations
[1] Deakin Univ, Sch Informat Technol, Geelong, Vic 3216, Australia
[2] Yunnan Normal Univ, Sch Phys & Elect Informat, Kunming 650500, Yunnan, Peoples R China
[3] Swinburne Univ Technol, Sch Software & Elect Engn, Melbourne, Vic 3122, Australia
Keywords
Android malware; deep neural network; high impact malicious samples; low impact malicious samples; static analysis; SVM; neural networks
DOI
10.1109/ACCESS.2019.2914311
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Android malware threatens the security and privacy of billions of mobile end users. Researchers have designed many methods to identify Android malware samples automatically and accurately. However, the rapid growth in the number of malicious samples outpaces the capacity of traditional Android malware detectors and classifiers to meet cyber security risk management needs. It is therefore important to identify the small proportion of malicious samples that may have a high security or privacy impact. In this paper, we propose a lightweight solution that automatically identifies such high-impact samples. We manually examine a number of Android malware families and their corresponding security incidents, and define two impact metrics for Android malicious samples. This investigation yields a new Android malware dataset with impact ground truth (low impact or high impact), which we use to empirically investigate the intrinsic characteristics of low-impact and high-impact malicious samples. To capture the patterns of malicious samples, we reverse engineer them and extract semantic features from both the AndroidManifest.xml files and the disassembled classes.dex code. The extracted features are then embedded into numerical vectors. We train highly accurate support vector machine and deep neural network classifiers to categorize candidate samples as low impact or high impact. The empirical results validate the effectiveness of the proposed lightweight solution, which can be further used to identify high-impact Android malicious samples in the wild.
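The feature step named in the abstract (parse AndroidManifest.xml, then embed the extracted features into numerical vectors) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the manifest has already been decoded from the APK's binary format into plain-text XML, it covers only requested permissions rather than the full feature set, and the function names (`extract_permissions`, `build_vocabulary`, `embed`) and the two sample manifests are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace that Android uses for attributes such as android:name.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Hypothetical decoded manifests, used only for illustration.
MANIFEST_A = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.a">
  <uses-permission android:name="android.permission.SEND_SMS"/>
  <uses-permission android:name="android.permission.INTERNET"/>
</manifest>"""

MANIFEST_B = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.b">
  <uses-permission android:name="android.permission.INTERNET"/>
</manifest>"""


def extract_permissions(manifest_xml):
    """Return the set of permission names requested in a decoded manifest."""
    root = ET.fromstring(manifest_xml)
    key = ANDROID_NS + "name"
    return {
        elem.attrib[key]
        for elem in root.iter("uses-permission")
        if key in elem.attrib
    }


def build_vocabulary(feature_sets):
    """Build a sorted list of every feature seen across all samples."""
    return sorted(set().union(*feature_sets))


def embed(features, vocabulary):
    """Map a feature set to a binary vector: 1 if the feature is present, else 0."""
    return [1 if name in features else 0 for name in vocabulary]


if __name__ == "__main__":
    samples = [extract_permissions(m) for m in (MANIFEST_A, MANIFEST_B)]
    vocabulary = build_vocabulary(samples)
    for features in samples:
        print(embed(features, vocabulary))
```

Binary vectors of this kind are one common input format for classifiers such as a support vector machine. Whether the paper uses exactly this encoding is not stated in the abstract.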
Pages: 66304-66316
Number of pages: 13