Disentangled Representation Learning in Heterogeneous Information Network for Large-scale Android Malware Detection in the COVID-19 Era and Beyond

Authors
Hou, Shifu [1]
Fan, Yujie [1]
Ju, Mingxuan [1]
Ye, Yanfang [1]
Wan, Wenqiang [2]
Wang, Kui [2]
Mei, Yinming [2]
Xiong, Qi [2]
Shao, Fudong [2]
Affiliations
[1] Case Western Reserve Univ, Dept Comp & Data Sci, Cleveland, OH 44106 USA
[2] Tencent, Tencent Secur Lab, Shenzhen, Guangdong, Peoples R China
Source
THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2021, Vol. 35
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the fight against the COVID-19 pandemic, many social activities have moved online, and society's overwhelming reliance on a complex cyberspace makes its security more important than ever. In this paper, we propose and develop an intelligent system named Dr.HIN to protect users against evolving Android malware attacks in the COVID-19 era and beyond. In Dr.HIN, besides app content, we consider higher-level semantics and social relations among apps, developers and mobile devices to depict Android apps comprehensively. We then introduce a structured heterogeneous information network (HIN) to model these complex relations and exploit a meta-path guided strategy to learn node (i.e., app) representations from the HIN. Because the representations of malware can be highly entangled with those of benign apps in the complex development ecosystem, detecting evolving malware poses a new challenge: learning the latent explanatory factors hidden in the HIN embeddings. To address this challenge, we integrate domain priors generated from different views (i.e., app content, app authorship, app installation) to devise an adversarial disentangler that separates the distinct, informative factors of variation hidden in the HIN embeddings for large-scale Android malware detection. This is the first attempt at disentangled representation learning on HIN data. Promising experimental results on real sample collections from the security industry demonstrate the performance of Dr.HIN in detecting evolving Android malware, in comparison with baselines and popular mobile security products.
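The abstract does not describe how the meta-path guided strategy is carried out. As a rough illustration of the general idea only, and not Dr.HIN's actual procedure, the sketch below performs a random walk over a toy HIN in which each step is restricted to the node type that a meta-path prescribes. The node types ("app", "developer", "device"), the example edges, and the helper names `build_hin` and `metapath_walk` are all hypothetical.

```python
import random
from collections import defaultdict


def build_hin(edges):
    """Index an undirected HIN as node -> neighbor type -> list of neighbors.

    Nodes are (type, id) tuples, e.g. ("app", "a1"). This layout is an
    assumption made for the sketch, not something taken from the paper.
    """
    adj = defaultdict(lambda: defaultdict(list))
    for u, v in edges:
        adj[u][v[0]].append(v)
        adj[v][u[0]].append(u)
    return adj


def metapath_walk(adj, start, metapath, walk_length, rng):
    """Random walk whose node types follow `metapath`, repeated cyclically.

    `metapath` must start and end with the same type (for example
    ["app", "developer", "app"]) so that the pattern can repeat.
    """
    if metapath[0] != metapath[-1]:
        raise ValueError("meta-path must start and end with the same node type")
    if start[0] != metapath[0]:
        raise ValueError("start node type must match the first meta-path type")

    cycle = len(metapath) - 1
    walk = [start]
    step = 0
    while len(walk) < walk_length:
        # The type of the next node is dictated by the meta-path.
        next_type = metapath[(step % cycle) + 1]
        candidates = adj.get(walk[-1], {}).get(next_type, [])
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))
        step += 1
    return walk


# Hypothetical toy network: two apps share a developer, one is on a device.
edges = [
    (("app", "a1"), ("developer", "d1")),
    (("app", "a2"), ("developer", "d1")),
    (("app", "a1"), ("device", "m1")),
]
adj = build_hin(edges)
walk = metapath_walk(adj, ("app", "a1"), ["app", "developer", "app"], 5, random.Random(0))
print(walk)
```

Walks of this kind are commonly fed to a skip-gram style model to obtain node embeddings; whether Dr.HIN does so is not stated in the abstract.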
Pages: 7754-7761
Number of pages: 8