The Role of Feature Selection in Machine Learning for Detection of Spam and Phishing Attacks

Cited by: 4
Authors
Salihovic, Ina [1]
Serdarevic, Haris [1]
Kevric, Jasmin [1]
Affiliations
[1] Int Burch Univ, Sarajevo 71000, Bosnia & Herceg
Keywords
Phishing; Spam emails; Machine learning; Feature selection;
DOI
10.1007/978-3-030-02577-9_47
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
As Internet use increases throughout the world, stronger network security becomes indispensable, since it reduces the chances of privacy spoofing, identity or information theft, and bank fraud. Two of the most frequent network security breaches involve phishing and spam emails, as they are an easy way to deliver a virus or a link to a malicious site, which can lead to extensive fraud. Although there is an abundance of tools for detecting and blocking these types of messages and websites, the problem persists. The purpose of this paper was to remove the human factor from security breaches executed in this manner by using various machine learning algorithms. To train and test the most successful algorithms (Random Forest, k-Nearest Neighbor, Artificial Neural Network, Support Vector Machine, Logistic Regression, Naive Bayes), the paper used two separate datasets, UCI's Phishing Websites Data Set and Spam Emails Dataset, together with the Weka software, and found that the best results for both were achieved with the Random Forest algorithm. However, the datasets responded differently to feature selection algorithms: the best result for phishing (97.33% accuracy) was achieved with Ranker + Principal Components optimization, and the best result for spam (94.24% accuracy) was achieved with BestFirst + CfsSubsetEval optimization in Weka. These findings provide a base platform for future work towards faster and more accurate online fraud detection.
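The abstract credits the best spam result to a correlation-based feature subset evaluator (CFS) combined with BestFirst search in Weka. As a rough illustration of what such an evaluator scores, the sketch below computes the standard CFS merit, k*r_cf / sqrt(k + k*(k-1)*r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation, and then selects features greedily. It is not the authors' Weka configuration: it assumes plain forward selection in place of best-first search, absolute Pearson correlation as the correlation measure, and a made-up eight-row toy dataset. The names pearson, cfs_merit and greedy_cfs are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def cfs_merit(columns, labels, subset):
    """CFS merit of a feature subset: rewards class correlation, penalizes redundancy."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean absolute feature-class correlation.
    r_cf = sum(abs(pearson(columns[i], labels)) for i in subset) / k
    # Mean absolute feature-feature correlation over all unordered pairs.
    pairs = [(i, j) for a, i in enumerate(subset) for j in subset[a + 1:]]
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(columns, labels):
    """Forward selection: add the feature that most improves merit; stop when none does."""
    selected, best = [], 0.0
    remaining = list(range(len(columns)))
    while remaining:
        feat = max(remaining, key=lambda f: cfs_merit(columns, labels, selected + [f]))
        merit = cfs_merit(columns, labels, selected + [feat])
        if merit <= best:
            break
        selected.append(feat)
        remaining.remove(feat)
        best = merit
    return selected, best


# Toy data (assumption, not from the paper): 1 = spam, 0 = legitimate.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = [
    [1, 1, 1, 0, 0, 0, 0, 1],  # feature 0: moderately predictive
    [1, 1, 1, 0, 0, 0, 0, 1],  # feature 1: exact duplicate of feature 0 (redundant)
    [1, 1, 0, 1, 0, 0, 1, 0],  # feature 2: moderately predictive, uncorrelated with feature 0
    [1, 0, 1, 0, 1, 0, 1, 0],  # feature 3: uncorrelated with the class
]

selected, merit = greedy_cfs(columns, labels)
print(selected, round(merit, 4))
```

On this toy data the search keeps features 0 and 2 and skips both the duplicate and the uninformative feature, which is the behaviour a correlation-based evaluator is designed to produce: a redundant feature adds nothing to the merit, and an irrelevant one lowers it.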
Pages: 476-483
Number of pages: 8