Malware Detection and Classification Based on n-grams Attribute Similarity

Cited by: 27
Authors
Zhang Fuyong [1]
Zhao Tiezhou [1]
Affiliations
[1] Dongguan Univ Technol, Sch Comp Sci & Network Secur, Dongguan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
malware detection; attribute similarity; machine learning; unknown malware; static analysis
DOI
10.1109/CSE-EUC.2017.157
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
The volume of unknown malware has increased dramatically, but existing security software cannot identify it effectively. In this paper, we propose a new malware detection and classification method based on n-gram attribute similarity. We extract all byte-code n-grams from the training samples and select the most relevant ones as attributes. After calculating the average attribute values for the malware and benign classes separately, we decide whether a test sample is malware or benign by comparing the similarity between its attributes and each of the two class averages. We compare our method with a variety of machine learning methods, including Naive Bayes, Bayesian Networks, Support Vector Machine and C4.5 Decision Tree. Experimental results on both a public dataset (Open Malware Benchmark) and a private, self-collected dataset show that our method outperforms the other four methods.
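The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how the "most relevant" n-grams are chosen or which similarity measure is used, so the sketch assumes relative n-gram frequencies as attribute values, the absolute difference of class means as the relevance score, and cosine similarity for the comparison. All function names (`byte_ngrams`, `select_attributes`, `train`, `classify`) and parameters (`n`, `top_k`) are hypothetical.

```python
from collections import Counter
import math


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count every overlapping n-gram of bytes in one sample."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def relative_freqs(data: bytes, n: int) -> dict:
    """Map each n-gram to its share of all n-grams in the sample."""
    counts = byte_ngrams(data, n)
    total = sum(counts.values()) or 1  # avoid dividing by zero on tiny inputs
    return {gram: c / total for gram, c in counts.items()}


def class_mean(freqs: list, gram: bytes) -> float:
    """Average frequency of one n-gram over all samples of a class."""
    return sum(f.get(gram, 0.0) for f in freqs) / len(freqs)


def select_attributes(mal: list, ben: list, top_k: int) -> list:
    """ASSUMPTION: rank n-grams by |malware mean - benign mean|, keep top_k."""
    grams = set().union(*mal, *ben)
    score = {g: abs(class_mean(mal, g) - class_mean(ben, g)) for g in grams}
    # Sort by descending score; break ties by byte order for determinism.
    return sorted(grams, key=lambda g: (-score[g], g))[:top_k]


def cosine(u: list, v: list) -> float:
    """ASSUMPTION: cosine similarity stands in for the paper's measure."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def train(malware: list, benign: list, n: int = 2, top_k: int = 8):
    """Return the selected attributes and the two class-average vectors."""
    mal = [relative_freqs(s, n) for s in malware]
    ben = [relative_freqs(s, n) for s in benign]
    attrs = select_attributes(mal, ben, top_k)
    mal_mean = [class_mean(mal, a) for a in attrs]
    ben_mean = [class_mean(ben, a) for a in attrs]
    return attrs, mal_mean, ben_mean


def classify(sample: bytes, model, n: int = 2) -> str:
    """Label a sample by whichever class average it is more similar to."""
    attrs, mal_mean, ben_mean = model
    freqs = relative_freqs(sample, n)
    vec = [freqs.get(a, 0.0) for a in attrs]
    if cosine(vec, mal_mean) > cosine(vec, ben_mean):
        return "malware"
    return "benign"


# Toy byte strings standing in for real binaries (illustrative data only).
toy_malware = [b"\x90\x90\x90\x90\xcc", b"\x90\x90\xcc\xcc\x90"]
toy_benign = [b"hello world", b"hello there"]
model = train(toy_malware, toy_benign, n=2, top_k=8)
label = classify(b"\x90\x90\x90", model, n=2)
```

Because each class is reduced to a single average vector, classifying a sample costs two similarity computations regardless of training-set size. Real use would need a much larger `top_k` and actual executable files rather than the toy byte strings above.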
Pages: 793-796
Page count: 4
Related papers
50 in total
  • [21] Classification of Metamorphic Virus Using N-Grams Signatures
    Hamid, Isredza Rahmi A.
    Sani, Nur Sakinah Md
    Abdullah, Zubaile
    Foozy, Cik Feresa Mohd
    Kipli, Kuryati
    [J]. RECENT ADVANCES ON SOFT COMPUTING AND DATA MINING (SCDM 2020), 2020, 978 : 140 - 149
  • [22] The distribution of N-grams
    Egghe, L
    [J]. SCIENTOMETRICS, 2000, 47 (02) : 237 - 252
  • [23] Collocations and N-grams
    Freebury-Jones, Darren
    [J]. RENAISSANCE AND REFORMATION, 2021, 44 (04) : 210 - 216
  • [24] Towards an automatic classification of images: Approach by the n-grams
    Laouamer, Lamri
    Biskri, Ismail
    Houmadi, Benamar
    [J]. WMSCI 2005: 9th World Multi-Conference on Systemics, Cybernetics and Informatics, Vol 3, 2005, : 73 - 78
  • [25] Composer classification using melodic combinatorial n-grams
    Alvarez, Daniel Alejandro Perez
    Gelbukh, Alexander
    Sidorov, Grigori
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249
  • [26] Error Classification Using Automatic Measures Based on n-grams and Edit Distance
    Benko, L'ubomir
    Benkova, Lucia
    Munkova, Dasa
    Munk, Michal
    Shulzenko, Danylo
    [J]. ADVANCED RESEARCH IN TECHNOLOGIES, INFORMATION, INNOVATION AND SUSTAINABILITY, ARTIIS 2022, PT I, 2022, 1675 : 345 - 356
  • [27] N-grams based feature selection and text representation for Chinese text classification
    Department of Computer Science and Engineering, Tongji University, Cao'an Road, 4800, Shanghai, 201804, China
    Unknown
    Unknown
    [J]. Int. J. Comput. Intell. Syst., 2009, 4 (365-374):
  • [28] N-grams based feature selection and text representation for Chinese Text Classification
    Zhihua Wei
    Duoqian Miao
    Jean Hugues Chauchat
    Rui Zhao
    Wen Li
    [J]. International Journal of Computational Intelligence Systems, 2009, 2 (4) : 365 - 374
  • [29] Feature Extension for Chinese Short Text Classification Based on Topical N-Grams
    Sun, Baoshan
    Zhao, Peng
    [J]. 2017 16TH IEEE/ACIS INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS 2017), 2017, : 477 - 482
  • [30] A CNN based approach to Phrase-Labelling through classification of N-Grams
    Choudhary, Chinmay
    O'Riordan, Colm
    [J]. PROCEEDINGS OF THE 11TH ANNUAL MEETING OF THE FORUM FOR INFORMATION RETRIEVAL EVALUATION (FIRE 2019), 2019, : 18 - 23