Authorship Attribution of Android Apps

Cited by: 12
Authors
Gonzalez, Hugo [1]
Stakhanova, Natalia [2]
Ghorbani, Ali A. [2]
Affiliations
[1] Polytech Univ San Luis Potosi, San Luis Potosi, San Luis Potosi, Mexico
[2] Univ New Brunswick, Fac Comp Sci, Fredericton, NB, Canada
Keywords
Android; authorship attribution; suspicious authors
DOI
10.1145/3176258.3176322
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Since the first computer virus hit the Advanced Research Projects Agency Network (ARPANET) in the early 1970s, the security community's interest has revolved around ways to expose the identities of malware writers. Knowledge of adversaries' identities promised security experts additional leverage in their ongoing battle against those perpetrators. At the dawn of the computing era, when malware writers were inexperienced and malicious software was relatively simple, uncovering the identities of virus writers was more or less straightforward. Manual analysis of source code often revealed personally identifiable information embedded by the authors themselves. Those times are long gone. Modern-day malware writers make extensive use of malware code generators to mass-produce new variants and employ advanced obfuscation techniques to hide their identities. As a result, the work of security experts trying to uncover the identities of malware writers has become significantly more challenging and time-consuming. To gain insight into the identity of an adversary, we turn our attention to authorship attribution research, which offers a broad spectrum of techniques for identifying the author of a document based on an analysis of the author's writing style. Equipped with these methods, we explore the attribution of Android binaries and the role that features related to the development process play in determining Android binary authorship. Within this context, we propose an incremental approach to authorship attribution of Android apps: first attributing apps to a set of known authors, and then generating new profiles for unknown apps. We assess the effectiveness of our approach on several sets of malicious and legitimate Android binaries produced by actual developers, as opposed to artificially created authors' data. We achieve 97.5% accuracy on these authors' data.
We further evaluate our approach on more than 131,000 apps collected from various sources including 10 different markets around the globe.
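The incremental idea described in the abstract (match an app against known author profiles first, and open a new profile when nothing matches) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names, the cosine similarity measure, the running-mean profiles, and the 0.8 threshold are all assumptions chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from math import sqrt


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class IncrementalAttributor:
    """Hypothetical helper: known-author matching with fallback profiles."""

    threshold: float = 0.8  # assumed cut-off, not taken from the paper
    profiles: dict[str, dict[str, float]] = field(default_factory=dict)
    counts: dict[str, int] = field(default_factory=dict)
    new_profiles: int = 0

    def add_known(self, author: str, features: dict[str, float]) -> None:
        """Fold one labelled app into the author's running-mean profile."""
        n = self.counts.get(author, 0)
        profile = self.profiles.setdefault(author, {})
        for key in set(profile) | set(features):
            old = profile.get(key, 0.0)
            profile[key] = (old * n + features.get(key, 0.0)) / (n + 1)
        self.counts[author] = n + 1

    def attribute(self, features: dict[str, float]) -> str:
        """Return the best-matching profile, creating one if none is close."""
        best, best_sim = None, -1.0
        for author, profile in self.profiles.items():
            sim = cosine(features, profile)
            if sim > best_sim:
                best, best_sim = author, sim
        if best is None or best_sim < self.threshold:
            # No known author is similar enough: start a new profile.
            self.new_profiles += 1
            best = f"unknown-{self.new_profiles}"
            self.add_known(best, features)
        return best


if __name__ == "__main__":
    attributor = IncrementalAttributor()
    # "f1", "f2", "f3" are placeholder feature names.
    attributor.add_known("alice", {"f1": 1.0})
    attributor.add_known("bob", {"f2": 1.0})
    print(attributor.attribute({"f1": 0.9, "f2": 0.1}))
    print(attributor.attribute({"f3": 1.0}))
```

In this sketch a later app that resembles a previously unseen one is assigned to the same generated profile, which is one simple way to read "generation of new profiles for unknown apps"; the paper itself may handle this step differently.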
Pages: 277-286
Number of pages: 10