A Comprehensive Study on Efficient and Accurate Machine Learning-Based Malicious PE Detection

Cited by: 2
Authors
Barut, Onur [1]
Zhang, Tong [1]
Luo, Yan [2]
Li, Peilong [3]
Affiliations
[1] Intel Corp, Network & Edge Grp, Santa Clara, CA 95054 USA
[2] Univ Massachusetts Lowell, Dept Elect & Comp Eng, Lowell, MA USA
[3] Elizabethtown Coll, Dept Comp Sci, Elizabethtown, PA USA
Keywords
Malware Analysis; Ransomware Detection; Machine Learning; Feature Engineering
DOI
10.1109/CCNC51644.2023.10060214
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Fast and accurate malware detection is critical for safe and trustworthy digital services. Because of its financial rewards, ransomware is one of the malware variants most commonly employed by cyber criminals. New malware variants arise on a regular basis, so databases must be kept up to date in order to protect the digital world from ransomware threats. In this study, we curated the Ransomary dataset, containing 2871 ransomware and 4208 benign PE files, so that researchers can apply their own algorithms to achieve fast and precise detection. We examined the Ransomary dataset and compared feature-extraction and raw-data techniques for static malware analysis. On the EMBER, DeepDetectNet, and Ransomary datasets, we found that effective feature selection with the LightGBM model can yield an AUC above 0.99. Finally, we demonstrate that using raw data from the first 1KB of PE files may yield accurate detection with an extremely short response time. We intend to continuously expand the Ransomary dataset and encourage more researchers to use static, dynamic, or hybrid analysis to identify ransomware more quickly and accurately.
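The raw-data idea in the abstract, using only the first 1KB of a PE file, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not say how the bytes are encoded or which model consumes them, so the function names (`first_kb_vector`, `looks_like_pe`), the zero-padding of short files, and the scaling to [0, 1] are all assumptions made for this example.

```python
from typing import List

HEADER_BYTES = 1024  # "first 1KB", as stated in the abstract


def first_kb_vector(data: bytes, length: int = HEADER_BYTES) -> List[float]:
    """Return the first `length` bytes scaled to [0, 1].

    Inputs shorter than `length` are zero-padded (an assumption of this
    sketch; the abstract does not describe how short files are handled).
    """
    head = data[:length]
    padded = head + bytes(length - len(head))
    return [b / 255.0 for b in padded]


def looks_like_pe(data: bytes) -> bool:
    """Cheap sanity check: PE files begin with the DOS header magic 'MZ'."""
    return data[:2] == b"MZ"


if __name__ == "__main__":
    # Synthetic bytes for demonstration only, not a real executable.
    sample = b"MZ" + bytes(range(256)) * 8
    vector = first_kb_vector(sample)
    print(len(vector), looks_like_pe(sample))
```

Because only a fixed-size prefix is read, the cost of building the input does not grow with file size, which is one plausible reason a raw-data approach can respond quickly. The resulting fixed-length vector could then be passed to whatever classifier is being evaluated.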
Pages: 4