Disentangled Representation Learning in Heterogeneous Information Network for Large-scale Android Malware Detection in the COVID-19 Era and Beyond

Authors
Hou, Shifu [1]
Fan, Yujie [1]
Ju, Mingxuan [1]
Ye, Yanfang [1]
Wan, Wenqiang [2]
Wang, Kui [2]
Mei, Yinming [2]
Xiong, Qi [2]
Shao, Fudong [2]
Affiliations
[1] Case Western Reserve Univ, Dept Comp & Data Sci, Cleveland, OH 44106 USA
[2] Tencent, Tencent Secur Lab, Shenzhen, Guangdong, Peoples R China
Source
THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2021, Vol. 35
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the fight against the COVID-19 pandemic, many social activities have moved online; society's overwhelming reliance on the complex cyberspace makes its security more important than ever. In this paper, we propose and develop an intelligent system named Dr.HIN to protect users against evolving Android malware attacks in the COVID-19 era and beyond. In Dr.HIN, besides app content, we propose to consider higher-level semantics and social relations among apps, developers and mobile devices to comprehensively depict Android apps; we then introduce a structured heterogeneous information network (HIN) to model the complex relations and exploit a meta-path guided strategy to learn node (i.e., app) representations from the HIN. As the representations of malware could be highly entangled with those of benign apps in the complex ecosystem of development, this poses a new challenge of learning the latent explanatory factors hidden in the HIN embeddings to detect the evolving malware. To address this challenge, we propose to integrate domain priors generated from different views (i.e., app content, app authorship, app installation) to devise an adversarial disentangler that separates the distinct, informative factors of variation hidden in the HIN embeddings for large-scale Android malware detection. This is the first attempt at disentangled representation learning on HIN data. Promising experimental results based on real sample collections from the security industry demonstrate the performance of Dr.HIN in detecting evolving Android malware, in comparison with baselines and popular mobile security products.
Pages: 7754-7761
Number of pages: 8