Text-based Malicious Domain Names Detection Based on Variational Autoencoder And Supervised Learning

Authors
Sun, Yuwei [1]
Chong, Ng S. T. [2]
Ochiai, Hideya [1]
Affiliations
[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo, Japan
[2] United Nations Univ, Campus Comp Ctr, Tokyo, Japan
Keywords
malicious domain names detection; VAE; cybersecurity; machine learning
DOI
10.1109/CISS48834.2020.1570601577
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
With the rapid development of information technology, the adoption of information systems in industry and institutions has become increasingly common. However, attacks that use zombie networks to access a host and thereby cause it to shut down have been frequent in recent years. Because domain names play a significant role in connecting to a server, they are considered a key to detecting these attacks. In this paper, we propose a text-based method that converts domain names into numeric features based on term frequency and inverse document frequency (TF-IDF). We then adopt a variational autoencoder (VAE), consisting of an encoder and a decoder, to extract hidden information from these features. Moreover, by collapsing the Gaussian distribution of the features at the hidden layer to its mean, we visualize the distribution of domain names. After that, we adopt a supervised learning model, a Convolutional Neural Network (CNN), to classify domain names as malicious or benign, and train it on the feature vectors from the VAE. The scheme achieves a validation accuracy of 0.868 for malicious domain name detection.
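The TF-IDF step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how domain names are tokenized, so the character-bigram tokenization, the unsmoothed `log(N / df)` weighting, the function names `char_ngrams` and `tfidf_features`, and the sample domains are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def char_ngrams(domain, n=2):
    """Split a domain name into overlapping character n-grams.

    Assumption: the paper's tokenization is not given in the abstract;
    character bigrams are used here only as a plausible stand-in.
    """
    name = domain.lower()
    return [name[i:i + n] for i in range(len(name) - n + 1)]


def tfidf_features(domains, n=2):
    """Return (vocab, matrix); each matrix row is one domain's TF-IDF vector."""
    docs = [Counter(char_ngrams(d, n)) for d in domains]
    vocab = sorted({gram for doc in docs for gram in doc})
    # Document frequency: number of domains that contain each n-gram.
    df = Counter(gram for doc in docs for gram in doc)
    n_docs = len(domains)
    matrix = []
    for doc in docs:
        total = sum(doc.values())
        row = []
        for gram in vocab:
            tf = doc[gram] / total if total else 0.0
            idf = math.log(n_docs / df[gram])  # unsmoothed IDF (assumed variant)
            row.append(tf * idf)
        matrix.append(row)
    return vocab, matrix


# Hypothetical sample domains, not taken from the paper's dataset.
domains = ["example.com", "example.org", "xkqzjv.com"]
vocab, features = tfidf_features(domains)
```

Each row of `features` is a fixed-length numeric vector that could serve as input to an encoder such as the VAE mentioned above. N-grams that occur in every domain receive a weight of zero under this IDF variant, so rarer character patterns dominate the representation.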
Pages: 192-196
Number of pages: 5