Detection of Malicious URLs through an Ensemble of Machine Learning Techniques

Cited by: 3
Authors
Venugopal, Shreya [1]
Panale, Shreya Yuvraj [1]
Agarwal, Manav [1]
Kashyap, Rishab [1]
Ananthanagu, U. [1]
Affiliations
[1] PES Univ, Dept Comp Sci, South Campus, Bangalore, Karnataka, India
Keywords
Malicious URLs and Web Pages; Machine Learning Models; BERT; LSTM; Decision Trees; Phishing Websites
DOI
10.1109/CSDE53843.2021.9718370
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper aims to classify URLs and web pages as legitimate or malicious in order to alert users and allow safer browsing of the internet. In the process, we identify points of interest and attributes that bring to light the characteristics of these malicious sources, allowing us to be aware of and prevent any damage they might cause. These attributes relate to the domain registration of the URLs, the URL text, and the structure and contents of the web page. Applying models such as BERT, LSTM, and Decision Trees, and combining them as an ensemble, yields a pragmatic solution to the problem, with the ensemble giving an accuracy of 95.3%. The approach also uses concepts such as web page reputation and the internal and external links of a web page. The method of classification used in this paper, which combines Natural Language Processing techniques and Machine Learning models over such a wide variety of features, has not been implemented earlier. We conclude the paper by suggesting ways to improve the solution to the problem.
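The abstract does not say which URL-text features are extracted or how the base models' outputs are combined, so the following is only a minimal sketch under stated assumptions. The function names `url_text_features` and `majority_vote`, the specific lexical features, and the majority-voting rule with ties resolved toward "malicious" are all hypothetical illustrations, not the authors' method.

```python
from collections import Counter
from typing import Dict, List
from urllib.parse import urlparse


def url_text_features(url: str) -> Dict[str, object]:
    """Illustrative lexical features of the URL text (assumed, not the paper's feature set)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "length": len(url),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
    }


def majority_vote(labels: List[str]) -> str:
    """Return the label most base models predicted; ties go to 'malicious' (an assumption)."""
    counts = Counter(labels)
    top = max(counts.values())
    winners = [label for label, n in counts.items() if n == top]
    return "malicious" if "malicious" in winners else winners[0]


if __name__ == "__main__":
    # Stand-ins for the labels that three base models (e.g. BERT, LSTM, Decision Tree) might emit.
    print(url_text_features("http://192.168.0.1/login@evil.example"))
    print(majority_vote(["malicious", "legitimate", "malicious"]))
```

In such a setup, each base model would be trained separately on its own view of the data (URL text, domain registration, page structure), and only the predicted labels would be passed to the voting step.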
Pages: 6