CIMDS: Adapting Postprocessing Techniques of Associative Classification for Malware Detection

Cited by: 55
Authors
Ye, Yanfang [1]
Li, Tao [2]
Jiang, Qingshan [3]
Wang, Youyu [3]
Affiliations
[1] Xiamen Univ, Dept Comp Sci, Xiamen 361005, Peoples R China
[2] Florida Int Univ, Sch Comp Sci, Miami, FL 33199 USA
[3] Xiamen Univ, Software Sch, Xiamen 361005, Peoples R China
Funding
National Science Foundation (USA);
Keywords
Associative classification; malware detection; postprocessing; rule pruning; rule ranking; rule selection;
DOI
10.1109/TSMCC.2009.2037978
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Malware is software designed to infiltrate or damage a computer system without the owner's informed consent (e.g., viruses, backdoors, spyware, trojans, and worms). Nowadays, numerous malware attacks pose a major security threat to computer users. Unfortunately, along with the development of malware writing techniques, the number of file samples that need to be analyzed on a daily basis, known as the "gray list," is constantly increasing. To help our virus analysts quickly and efficiently pick out the malicious executables from the "gray list," an automatic and robust tool to analyze and classify the file samples is needed. In our previous work, we developed an intelligent malware detection system (IMDS) by adopting an associative classification method based on the analysis of application programming interface (API) execution calls. Despite its good performance in malware detection, IMDS still faces the following two challenges: 1) handling the large set of generated rules to build the classifier; and 2) finding effective rules for classifying new file samples. In this paper, we first systematically evaluate the effects of the postprocessing techniques of associative classification (e.g., rule pruning, rule ranking, and rule selection) in malware detection, and then propose an effective method, CIDCPF, to detect malware from the "gray list." To the best of our knowledge, this is the first effort to use postprocessing techniques of associative classification in malware detection. CIDCPF adapts the postprocessing techniques as follows: it first applies Chi-square testing and Insignificant rule pruning, then uses Database coverage based on the Chi-square measure rule ranking mechanism and Pessimistic error estimation, and finally performs prediction by selecting the best First rule. We have incorporated the CIDCPF method into our existing IMDS system, and we call the new system CIMDS.
Case studies are performed on a large collection of file samples obtained from the Antivirus Laboratory at Kingsoft Corporation. Promising experimental results demonstrate that our CIMDS system outperforms popular antivirus software tools, such as McAfee VirusScan and Norton AntiVirus, as well as previous data-mining-based detection systems that employed Naive Bayes, support vector machine, and decision tree techniques, in both efficiency and the ability to detect malware from the "gray list." In particular, our CIMDS system can greatly reduce the number of generated rules, which makes it easy for our virus analysts to identify the useful ones.
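The postprocessing pipeline named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a simplified Python example, under stated assumptions, that covers only chi-square pruning, chi-square-based ranking, database coverage, and best-first-rule prediction. Insignificant rule pruning and pessimistic error estimation are omitted. The `Rule` class, the function names, the toy API-call samples, and the threshold 3.84 (the chi-square critical value at the 0.05 level with one degree of freedom) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    antecedent: frozenset  # API calls that must all appear in a sample
    label: str             # class predicted when the antecedent matches


def chi_square(rule, samples):
    """Return (chi-square, positively_correlated) for the 2x2 table
    'antecedent matches' x 'sample label equals rule label'."""
    a = b = c = d = 0
    for apis, label in samples:
        matched = rule.antecedent <= apis
        same = label == rule.label
        if matched and same:
            a += 1
        elif matched:
            b += 1
        elif same:
            c += 1
        else:
            d += 1
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, False
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / denom, a * d > b * c


def build_classifier(rules, samples, threshold=3.84):
    # Step 1: chi-square pruning (threshold is an assumed value).
    scored = []
    for rule in rules:
        score, positive = chi_square(rule, samples)
        if positive and score >= threshold:
            scored.append((score, rule))
    # Step 2: rank the surviving rules by chi-square, highest first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Step 3: database coverage. Keep a rule only if it correctly
    # classifies at least one sample not covered by an earlier rule.
    uncovered = list(samples)
    selected = []
    for _, rule in scored:
        hits = [s for s in uncovered
                if rule.antecedent <= s[0] and s[1] == rule.label]
        if hits:
            selected.append(rule)
            uncovered = [s for s in uncovered
                         if not rule.antecedent <= s[0]]
    return selected


def predict(selected, apis, default):
    # Best first rule: the highest-ranked matching rule decides.
    for rule in selected:
        if rule.antecedent <= apis:
            return rule.label
    return default


# Toy data (hypothetical): sets of API calls with known labels.
M, B = "malware", "benign"
samples = [
    (frozenset({"OpenProcess", "WriteProcessMemory", "CreateRemoteThread"}), M),
    (frozenset({"WriteProcessMemory", "CreateRemoteThread"}), M),
    (frozenset({"OpenProcess", "WriteProcessMemory", "CreateRemoteThread",
                "RegSetValue"}), M),
    (frozenset({"OpenProcess", "WriteProcessMemory", "CreateRemoteThread"}), M),
    (frozenset({"CreateFile", "ReadFile"}), B),
    (frozenset({"CreateFile", "ReadFile", "RegSetValue"}), B),
    (frozenset({"CreateFile", "OpenProcess"}), B),
    (frozenset({"CreateFile"}), B),
]
r1 = Rule(frozenset({"WriteProcessMemory", "CreateRemoteThread"}), M)
r2 = Rule(frozenset({"CreateFile"}), B)
r3 = Rule(frozenset({"RegSetValue"}), M)                    # uncorrelated
r4 = Rule(frozenset({"OpenProcess", "WriteProcessMemory"}), M)  # redundant

selected = build_classifier([r1, r2, r3, r4], samples)
print([sorted(r.antecedent) for r in selected])
print(predict(selected, frozenset({"WriteProcessMemory",
                                   "CreateRemoteThread"}), default=B))
```

In this toy run, `r3` is removed by the chi-square test because it is uncorrelated with its label, and `r4` is removed by database coverage because every sample it classifies is already covered by the higher-ranked `r1`. This mirrors, on a very small scale, how postprocessing shrinks the rule set that analysts have to inspect.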
Pages: 298-307
Page count: 10
Related papers
50 in total
  • [1] Associative Classification and Post-processing Techniques used for Malware Detection
    Ye, Yanfang
    Jiang, Qingshan
    Zhuang, Weiwei
    [J]. 2008 2ND INTERNATIONAL CONFERENCE ON ANTI-COUNTERFEITING, SECURITY AND IDENTIFICATION, 2008, : 276 - +
  • [2] Study on Machine Learning Techniques for Malware Classification and Detection
    Moon, Jaewoong
    Kim, Subin
    Song, Jaeseung
    Kim, Kyungshin
    [J]. KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2021, 15 (12): : 4308 - 4325
  • [3] Adapting Associative Classification to Text Categorization
    Li, Baoli
    Sugandh, Neha
    Garcia, Ernest V.
    Ram, Ashwin
    [J]. DOCENG'07: PROCEEDINGS OF THE 2007 ACM SYMPOSIUM ON DOCUMENT ENGINEERING, 2007, : 205 - 207
  • [4] Malware Detection Using Nonparametric Bayesian Clustering and Classification Techniques
    Kao, Yimin
    Reich, Brian
    Storlie, Curtis
    Anderson, Blake
    [J]. TECHNOMETRICS, 2015, 57 (04) : 535 - 546
  • [5] A Comparative Analysis of Machine Learning Techniques for Classification and Detection of Malware
    Al-Janabi, Maryam
    Altamimi, Ahmad Mousa
    [J]. 2020 21ST INTERNATIONAL ARAB CONFERENCE ON INFORMATION TECHNOLOGY (ACIT), 2020,
  • [6] Parameter Optimization of Classification Techniques for PDF based Malware Detection
    Hossain, Sm Mukbul
    Ayub, Md Ahsan
    [J]. 2020 23RD INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY (ICCIT 2020), 2020,
  • [7] Performance comparison of visualization-based malware detection and classification techniques
    Shah, Syed Shakir Hameed
    Jamil, Norziana
    Khan, Atta Ur Rehman
    [J]. 2022 17TH INTERNATIONAL CONFERENCE ON EMERGING TECHNOLOGIES (ICET'22), 2022, : 200 - 205
  • [8] The Use of Machine Learning Techniques to Advance the Detection and Classification of Unknown Malware
    Shhadat, Ihab
    Bataineh, Bara'
    Hayajneh, Amena
    Al-Sharif, Ziad A.
    [J]. 11TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT) / THE 3RD INTERNATIONAL CONFERENCE ON EMERGING DATA AND INDUSTRY 4.0 (EDI40) / AFFILIATED WORKSHOPS, 2020, 170 : 917 - 922
  • [9] Tools & Techniques for Malware Analysis and Classification
    Gandotra, Ekta
    Bansal, Divya
    Sofat, Sanjeev
    [J]. INTERNATIONAL JOURNAL OF NEXT-GENERATION COMPUTING, 2016, 7 (03): : 176 - 197
  • [10] Pattern Recognition Techniques for the Classification of Malware Packers
    Sun, Li
    Versteeg, Steven
    Boztas, Serdar
    Yann, Trevor
    [J]. INFORMATION SECURITY AND PRIVACY, 2010, 6168 : 370 - +