Dark Hazard: Learning-based, Large-scale Discovery of Hidden Sensitive Operations in Android Apps

Cited by: 28
Authors
Pan, Xiaorui [1]
Wang, Xueqiang [1]
Duan, Yue [2]
Wang, XiaoFeng [1]
Yin, Heng [2]
Affiliations
[1] Indiana Univ, Bloomington, IN 47405 USA
[2] Univ Calif Riverside, Riverside, CA 92521 USA
Funding
U.S. National Science Foundation
DOI
10.14722/ndss.2017.23265
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Hidden sensitive operations (HSO) such as stealing private user data upon receiving an SMS message are increasingly utilized by mobile malware and other potentially-harmful apps (PHAs) to evade detection. Identification of such behaviors is hard, due to the challenge in triggering them during an app's runtime. Current static approaches rely on the trigger conditions or hidden behaviors known beforehand and therefore cannot capture previously unknown HSO activities. Also, these techniques tend to be computationally intensive and therefore less suitable for analyzing a large number of apps. As a result, our understanding of real-world HSO today is still limited, not to mention effective means to mitigate this threat. In this paper, we present HSOMINER, an innovative machine-learning-based program analysis technique that enables a large-scale discovery of unknown HSO activities. Our approach leverages a set of program features that characterize an HSO branch and can be relatively easy to extract from an app. These features summarize a set of unique observations about an HSO condition, its paths and the relations between them, and are designed to be general for finding hidden suspicious behaviors. Particularly, we found that a trigger condition is less likely to relate to the path of its branch through data flows or shared resources, compared with a legitimate branch. Also, the behaviors exhibited by the two paths of an HSO branch tend to be conspicuously different (innocent on one side and sinister on the other). Most importantly, even though these individual features are not sufficiently accurate for capturing HSO on their own, collectively they are shown to be highly effective in identifying such behaviors. This differentiating power is harnessed by HSOMINER to classify Android apps, which achieves a high precision (>98%) and coverage (>94%), and is also efficient as discovered in our experiments.
The new tool was further used in a measurement study involving 338,354 real-world apps, the largest one ever conducted on suspicious hidden operations. Our research brought to light the pervasiveness of HSO activities, which are present in 18.7% of the apps we analyzed, surprising trigger conditions (e.g., click on a certain region of a view) and behaviors (e.g., hiding operations in a dynamically generated receiver), which help better understand the problem and contribute to more effective defense against this new threat to the mobile platform.
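The two observations in the abstract (a trigger condition that is weakly related to its paths, and two paths whose behaviors differ sharply) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the `Branch` record, the `SENSITIVE_APIS` list, the two feature functions, and the equal-weight score are hypothetical and are not HSOMINER's actual features or classifier, which the abstract does not specify.

```python
from dataclasses import dataclass

# Hypothetical list of sensitive API names, for illustration only.
SENSITIVE_APIS = {"sendTextMessage", "getDeviceId", "getLastKnownLocation"}


@dataclass
class Branch:
    """Toy summary of one conditional branch (all fields are assumptions)."""
    condition_vars: set    # variables read by the branch condition
    true_path_vars: set    # variables used on the "true" path
    false_path_vars: set   # variables used on the "false" path
    true_path_calls: set   # API names invoked on the "true" path
    false_path_calls: set  # API names invoked on the "false" path


def condition_path_relation(b):
    """Fraction of condition variables that also appear on either path."""
    if not b.condition_vars:
        return 0.0
    used = b.true_path_vars | b.false_path_vars
    return len(b.condition_vars & used) / len(b.condition_vars)


def behavior_difference(b):
    """Jaccard distance between the sensitive calls made on the two paths."""
    t = b.true_path_calls & SENSITIVE_APIS
    f = b.false_path_calls & SENSITIVE_APIS
    union = t | f
    if not union:
        return 0.0
    return 1.0 - len(t & f) / len(union)


def suspicion_score(b):
    """Equal-weight average of two weak signals; a stand-in for a trained model."""
    return 0.5 * (1.0 - condition_path_relation(b)) + 0.5 * behavior_difference(b)


# A branch whose condition is unrelated to its paths and whose paths differ.
hidden = Branch({"sms_body"}, {"imei"}, {"view"},
                {"getDeviceId", "sendTextMessage"}, {"setText"})
# A branch whose condition feeds both paths, with no sensitive calls.
benign = Branch({"count"}, {"count"}, {"count"}, {"setText"}, {"setText"})

print(suspicion_score(hidden), suspicion_score(benign))
```

In this sketch neither signal is decisive alone. Combining them mirrors, in a very simplified way, the abstract's point that individually weak features become useful collectively.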
Pages: 15