On Feature Selection for the Prediction of Phishing Websites

Cited by: 2
Authors
Fadheel, Wesam [1]
Abusharkh, Mohamed [2]
Abdel-Qader, Ikhlas [3]
Affiliations
[1] Western Michigan Univ, Dept Comp Sci, Kalamazoo, MI 49008 USA
[2] Ferris State Univ, Sch Digital Media, Grand Rapids, MI USA
[3] Western Michigan Univ, Dept Elect & Comp Engn, Kalamazoo, MI 49008 USA
DOI
10.1109/DASC-PICom-DataCom-CyberSciTec.2017.146
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
With the rise of the big data paradigm, large data sets are being made available for knowledge mining. While this opens up possibilities for new insights every day, it also exposes data consumers to an increasing amount of low-quality, unreliable, redundant, or noisy data, which negatively affects the process of harvesting knowledge and recognizing patterns. Efficient feature selection methods are therefore needed to enable real-time prediction or classification systems. Feature selection is the process of identifying the most relevant attributes and removing the redundant and irrelevant ones. In this study, we implemented the Kaiser-Meyer-Olkin (KMO) test as a feature selection method and applied it to a publicly available phishing dataset, namely the UCI phishing websites dataset. Furthermore, we used Logistic Regression and Support Vector Machine as classification methods to validate the feature selection method. Our results show only a slight difference in accuracy between the implementation using the full feature set and the one using the proposed, much smaller feature set (almost 63% of the original feature set). This reduction in dimensionality is significant for real-time systems, especially when the reduction in accuracy is slight. From there, we present a framework enabling a significant reduction in features. This opens the door for future work in which a wider set of classification algorithms will be tested in order to achieve dimensionality reduction together with an increase in accuracy.
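To make the KMO-based selection idea concrete, the sketch below computes the standard per-feature KMO measure of sampling adequacy, which compares squared correlations against squared partial correlations, and keeps the features whose value reaches a cutoff. It is only an illustration under stated assumptions: the abstract does not give the authors' exact procedure or threshold, so the function names `kmo_per_feature` and `select_features`, the 0.5 cutoff, and the synthetic data (used here in place of the UCI phishing dataset) are all hypothetical.

```python
import numpy as np


def kmo_per_feature(X):
    """Return (per-feature KMO values, overall KMO) for X of shape (n_samples, n_features)."""
    corr = np.corrcoef(X, rowvar=False)
    # Partial correlations come from the inverse correlation matrix;
    # the pseudo-inverse is used so near-singular matrices do not raise.
    inv = np.linalg.pinv(corr)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    # Only off-diagonal entries enter the KMO sums.
    np.fill_diagonal(corr, 0.0)
    np.fill_diagonal(partial, 0.0)
    r2 = corr ** 2
    p2 = partial ** 2
    per_feature = r2.sum(axis=0) / (r2.sum(axis=0) + p2.sum(axis=0))
    overall = r2.sum() / (r2.sum() + p2.sum())
    return per_feature, overall


def select_features(X, threshold=0.5):
    """Return column indices whose KMO value is at least `threshold`.

    The 0.5 default is an assumed, conventional cutoff, not a value from the paper.
    """
    per_feature, _ = kmo_per_feature(X)
    return np.flatnonzero(per_feature >= threshold)


# Synthetic stand-in data (assumption): three features driven by one shared
# latent factor, plus two independent noise features.
rng = np.random.default_rng(0)
latent = rng.normal(size=(2000, 1))
X_demo = np.hstack([
    latent + 0.5 * rng.normal(size=(2000, 3)),
    rng.normal(size=(2000, 2)),
])

kmo_values, kmo_overall = kmo_per_feature(X_demo)
print("per-feature KMO:", np.round(kmo_values, 3))
print("overall KMO:", round(float(kmo_overall), 3))
print("kept feature indices:", select_features(X_demo))
```

The retained columns could then be passed to any classifier, such as the Logistic Regression and Support Vector Machine models named in the abstract, to compare accuracy against the full feature set.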
Pages: 871-876
Number of pages: 6