A high-accuracy phishing website detection method based on machine learning

Cited by: 9
Authors
Bahaghighat, Mahdi [1 ]
Ghasemi, Majid [1 ]
Ozen, Figen [2 ]
Affiliations
[1] Imam Khomeini Int Univ, Dept Comp Engn, Qazvin, Iran
[2] Halic Univ, Istanbul, Turkiye
Keywords
Phishing website detection; Cyber security; Machine learning; Classification; XGBoost
DOI
10.1016/j.jisa.2023.103553
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The rapid development of e-commerce, e-banking, and social networks has made phishing attack detection one of the most critical technologies in cyber security systems. To improve the efficiency of anti-phishing techniques, we present an improved predictive model based on machine learning. The proposed method uses six different algorithms: Logistic Regression, K-Nearest Neighbors, Naive Bayes, Random Forest, Support Vector Machine, and Extreme Gradient Boosting (XGBoost). The experiments are based on a public dataset of 58,000 legitimate websites and 30,647 phishing ones, with 112 attributes for each sample. Our evaluations of the feature selection process show that balancing the dataset and dropping constant features yields a noticeable improvement. We conducted our evaluation on eight major scenarios. The experimental results of our phishing website detection (PWD) method show that every algorithm reached an accuracy of more than 93%, and that the XGBoost classifier outperformed the others with 99.2% overall accuracy, 99.1% precision, 99.4% recall, and 99.1% specificity. In addition, the study achieved a run-time of about 1500 ms for the XGBoost algorithm without dimension reduction, while using Principal Component Analysis (PCA) reduced it to just 869 ms. As a result, the proposed approach would be practical in both offline and real-time applications.
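Two of the steps named in the abstract, dropping constant features and reporting accuracy, precision, recall, and specificity, can be illustrated with a minimal sketch. This is not the authors' code: the function names `drop_constant_features` and `binary_metrics`, the toy data, and the choice of phishing = 1 as the positive class are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def drop_constant_features(rows: Sequence[Sequence[float]]) -> List[int]:
    """Return the indices of columns that take more than one distinct value.

    Hypothetical helper: a column whose value never changes carries no
    information for a classifier, so it can be removed before training.
    """
    if not rows:
        return []
    n_cols = len(rows[0])
    return [j for j in range(n_cols) if len({row[j] for row in rows}) > 1]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute the four metrics from the confusion-matrix counts.

    Assumption: label 1 (phishing) is the positive class, 0 is legitimate.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # phishing caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # phishing missed
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Toy data (made up): column 0 is constant and should be dropped.
rows = [[0, 3, 1], [0, 5, 1], [0, 2, 0], [0, 7, 1]]
kept_columns = drop_constant_features(rows)

# Toy labels and predictions (made up), not results from the paper.
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)

print(kept_columns)
print(metrics)
```

Specificity is the recall of the legitimate class, which is why it is reported alongside recall when a dataset is imbalanced: a classifier can score high accuracy while still flagging many legitimate sites.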
Pages: 17
Related papers
50 in total
  • [1] Phishing Website Detection Based on Machine Learning: A Survey
    Singh, Charu
    Meenu
    2020 6TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATION SYSTEMS (ICACCS), 2020 : 398 - 404
  • [2] Phishing website detection based on effective machine learning approach
    Harinahalli Lokesh, Gururaj
    BoreGowda, Goutham
    Journal of Cyber Security Technology, 2021, 5 (01) : 1 - 14
  • [3] A Survey of Machine Learning-Based Solutions for Phishing Website Detection
    Tang, Lizhen
    Mahmoud, Qusay H.
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2021, 3 (03) : 672 - 694
  • [4] Phishing Website Classification and Detection Using Machine Learning
    Kumar, Jitendra
    Santhanavijayan, A.
    Janet, B.
    Rajendran, Balaji
    Bindhumadhava, B. S.
    2020 INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATION AND INFORMATICS (ICCCI - 2020), 2020 : 473 - 478
  • [5] Detection of Phishing Website Using Machine Learning Approach
    Vilas, Mahajan Mayuri
    Ghansham, Kakade Prachi
    Jaypralash, Sawant Purva
    Shila, Pawar
    2019 4TH INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, COMMUNICATION, COMPUTER TECHNOLOGIES AND OPTIMIZATION TECHNIQUES (ICEECCOT), 2019 : 384 - +
  • [6] Intelligent phishing website detection using machine learning
    Ashish Kumar Jha
    Raja Muthalagu
    Pranav M. Pawar
    Multimedia Tools and Applications, 2023, 82 : 29431 - 29456
  • [7] Intelligent phishing website detection using machine learning
    Jha, Ashish Kumar
    Muthalagu, Raja
    Pawar, Pranav M.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (19) : 29431 - 29456
  • [8] Machine learning approach for phishing website detection: A literature survey
    Patil, Rutuja R.
    Kaur, Gagandeep
    Jain, Himank
    Tiwari, Ayush
    Joshi, Soham
    Rao, Keshav
    Sharma, Amit
    JOURNAL OF DISCRETE MATHEMATICAL SCIENCES & CRYPTOGRAPHY, 2022, 25 (03) : 817 - 827
  • [9] Comparative Study of Machine Learning Algorithms for Phishing Website Detection
    Omari, Kamal
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (09) : 417 - 425
  • [10] Towards benchmark datasets for machine learning based website phishing detection: An experimental study
    Hannousse, Abdelhakim
    Yahiouche, Salima
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2021, 104