A Multi-Leveled Approach and its Application in Classifying Malware Programs using Multiple Sources of Telemetry Data

Cited by: 0
Authors
Djaneye-Boundjou, Ouboti [1]
Messay-Kebede, Temesguen [2]
Kapp, David [2]
Affiliations
[1] Univ Dayton, Dept Elect & Comp Engn, Dayton, OH 45469 USA
[2] WPAFB, Air Force Res Lab, Resilient & Agile Av Branch, Dayton, OH 45433 USA
DOI
10.1109/NAECON49338.2021.9696417
Chinese Library Classification
V [Aeronautics, Astronautics]
Discipline classification code
08; 0825
Abstract
We use features obtained from both the compiled and the disassembly files provided for each malware program in the BIG 2015 dataset to classify those programs. The HEX codes in the compiled files are represented as images, and the Latent Dirichlet Allocation (LDA) algorithm is used to represent documents of opcodes, extracted from the disassembly files, as weighted mixtures of LDA-generated topics. For classification, we propose a Multi-Layer Perceptron (MLP) based system that consists of two serially connected levels, with each level accepting only one type of feature as input. The first level takes in the malware image features exclusively. Its output and the LDA topic weight features are then fed into the second level, which outputs the final classification predictions. Using backpropagation, the MLPs at both levels are trained jointly by minimizing the cross-entropy loss computed on the second level's classification predictions. On a held-out test set containing 25% of the malware programs, randomly sampled along class lines from the labeled set of the BIG 2015 dataset, the proposed classification system achieves a classification accuracy of 98.9%.
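The two-level data flow described in the abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the layer sizes, the single dense layer per level, the ReLU activation, the random initialization, and all function names (dense, init_layer, two_level_forward) are assumptions chosen for illustration, and no training step is shown. The only details taken from the abstract are that level 1 sees image features only, that level 2 receives the level-1 output together with the LDA topic weights, and that the cross-entropy loss is computed on the level-2 predictions.

```python
import math
import random


def dense(x, weights, biases):
    """Affine layer: one output value per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    m = max(x)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    """Hypothetical initializer: small uniform weights, zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def two_level_forward(image_feats, topic_weights, level1, level2):
    # Level 1 accepts the malware image features exclusively.
    level1_out = relu(dense(image_feats, *level1))
    # Level 2 accepts the level-1 output plus the LDA topic weights.
    logits = dense(level1_out + list(topic_weights), *level2)
    return softmax(logits)


def cross_entropy(probs, label):
    """Loss on the second level's prediction for one sample."""
    return -math.log(probs[label])


# Illustrative sizes only (assumed); BIG 2015 has 9 malware families.
N_IMAGE, N_HIDDEN, N_TOPICS, N_CLASSES = 16, 8, 5, 9

rng = random.Random(0)
level1 = init_layer(rng, N_IMAGE, N_HIDDEN)
level2 = init_layer(rng, N_HIDDEN + N_TOPICS, N_CLASSES)

# Stand-in inputs: random image features and a normalized topic mixture.
image_feats = [rng.random() for _ in range(N_IMAGE)]
raw = [rng.random() for _ in range(N_TOPICS)]
topic_weights = [r / sum(raw) for r in raw]

probs = two_level_forward(image_feats, topic_weights, level1, level2)
loss = cross_entropy(probs, label=3)
```

Because the loss depends on level1 only through the level-2 output, backpropagating it would update both levels together, which matches the joint training the abstract describes.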
Pages: 283-287
Page count: 5