A novel phishing website classification method based on hybrid sampling

Authors
Srivastava, Jaya [1]
Sharan, Aditi [2]
Affiliations
[1] Computer Services Centre, Indian Institute of Technology Delhi, New Delhi, India
[2] Jawaharlal Nehru University, New Delhi, India
Keywords
Anomaly detection - Classification (of information) - Computer crime - Cybersecurity - Logistic regression - Support vector machines
DOI
10.1080/23742917.2023.2240606
Abstract
In real-world anomaly detection tasks such as credit card fraud detection, cancer patient detection, and phishing website detection, training datasets often have a skewed class distribution. Traditional Machine Learning (ML) classification algorithms, however, assume a balanced class distribution and equal misclassification costs. As a result, when trained on class-imbalanced data they tend to produce biased and inaccurate predictive models. In this study, we propose four novel phishing website classification models, SMOTEENN-XGB, SMOTEENN-RF, SMOTEENN-LR, and SMOTEENN-SVM, which combine the SMOTEENN (SMOTE + ENN) hybrid sampling technique with eXtreme Gradient Boosting (XGB), Random Forest (RF), Logistic Regression (LR), and Support Vector Machine (SVM) classifiers, respectively. SMOTEENN hybrid sampling is applied to address class imbalance in phishing website datasets before the classification models are built. To the best of our knowledge, these four models for phishing website detection based on SMOTEENN hybrid sampling have not been published in existing studies. © 2023 Informa UK Limited, trading as Taylor & Francis Group.
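The hybrid sampling idea named in the abstract can be illustrated with a small standard-library sketch. It is not the authors' implementation: the function names (`nearest`, `smote`, `enn`), the toy two-feature data, the neighbour count `k=3`, and the majority-vote cleaning rule are all assumptions made for illustration, and the sketch handles only the binary case. SMOTE adds synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours; ENN then removes samples whose label disagrees with most of their neighbours.

```python
import math
import random

# A sample is a (features, label) pair; label 1 = phishing (minority), 0 = legitimate.

def nearest(sample, pool, k):
    """Return the k samples in pool closest to `sample` (Euclidean), skipping `sample` itself."""
    others = [s for s in pool if s is not sample]
    return sorted(others, key=lambda s: math.dist(sample[0], s[0]))[:k]

def smote(data, minority, k=3, rng=None):
    """Add synthetic minority samples until both classes have the same size (binary case)."""
    rng = rng or random.Random(0)
    minor = [s for s in data if s[1] == minority]
    n_needed = (len(data) - len(minor)) - len(minor)  # majority count minus minority count
    synthetic = []
    for _ in range(max(0, n_needed)):
        base = rng.choice(minor)
        neighbour = rng.choice(nearest(base, minor, k))
        u = rng.random()  # position along the segment from base to neighbour
        point = tuple(a + u * (b - a) for a, b in zip(base[0], neighbour[0]))
        synthetic.append((point, minority))
    return data + synthetic

def enn(data, k=3):
    """Keep only samples whose label matches the majority of their k nearest neighbours."""
    kept = []
    for s in data:
        votes = [n[1] for n in nearest(s, data, k)]
        if votes.count(s[1]) * 2 > len(votes):
            kept.append(s)
    return kept

# Hypothetical toy data: 40 legitimate sites and 8 phishing sites, two numeric features each.
rng = random.Random(42)
legit = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(40)]
phish = [((rng.gauss(3, 1), rng.gauss(3, 1)), 1) for _ in range(8)]

balanced = smote(legit + phish, minority=1)  # oversampling step
cleaned = enn(balanced)                      # cleaning step

for name, subset in (("original", legit + phish), ("after SMOTE", balanced), ("after ENN", cleaned)):
    counts = {label: sum(1 for s in subset if s[1] == label) for label in (0, 1)}
    print(name, counts)
```

The cleaned set would then be passed to whichever classifier is being trained (XGB, RF, LR, or SVM in the abstract). In practice, the imbalanced-learn package offers a `SMOTEENN` class in `imblearn.combine`; its default cleaning rule is stricter than the majority vote assumed here, so results would differ from this sketch.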
Pages: 1 / 30