A keyword-based combination approach for detecting phishing webpages

Cited by: 28
Authors
Ding, Yan [1]
Luktarhan, Nurbol [1]
Li, Keqin [2]
Slamu, Wushour [1]
Affiliations
[1] Xinjiang Univ, Coll Informat Sci & Engn, Urumqi, Peoples R China
[2] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY USA
Funding
China Postdoctoral Science Foundation
Keywords
Heuristic rule; Machine learning; Phishing; Search engine; URL obfuscation techniques
DOI
10.1016/j.cose.2019.03.018
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
This paper proposes the Search & Heuristic Rule & Logistic Regression (SHLR) combination detection method, which targets the obfuscation techniques commonly used by phishing websites and improves the efficiency of filtering legitimate webpages. The method has three steps. First, the content of the webpage's title tag is submitted as search keywords to the Baidu search engine; the webpage is considered legitimate if its domain matches the domain name of any of the top-10 search results, and otherwise it is evaluated further. Second, a webpage that cannot be identified as legitimate is examined against heuristic rules defined on character features to determine whether it is a phishing page. These first two steps filter webpages quickly, which meets the needs of real-time detection. Finally, a logistic regression classifier assesses the remaining pages to enhance the adaptability and accuracy of the detection method. The experimental results show that SHLR can filter 61.9% of legitimate webpages and identify 22.9% of phishing webpages based on uniform resource locator (URL) lexical information. The accuracy of SHLR is 98.9%, indicating strong phishing detection performance. (C) 2019 Elsevier Ltd. All rights reserved.
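The three-step cascade described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the host-based domain comparison, the three example character rules, and the 0.5 threshold are all assumptions, since the abstract does not specify them. The search engine query and the trained logistic regression model are passed in as callables so that the sketch stays self-contained.

```python
from typing import Callable, Iterable
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the lower-cased host of a URL without a leading 'www.'.

    Assumption: comparing hosts is a simplification of the paper's
    domain matching; a real system would likely compare registered domains.
    """
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def looks_like_phishing(url: str) -> bool:
    """Hypothetical character-feature rules (not taken from the paper)."""
    host = domain_of(url)
    host_is_ip = host.replace(".", "").isdigit()  # e.g. 192.168.0.1
    return "@" in url or host_is_ip or host.count(".") > 4


def shlr_classify(
    url: str,
    title: str,
    search_top10: Callable[[str], Iterable[str]],
    lr_probability: Callable[[str], float],
    threshold: float = 0.5,  # assumed decision threshold
) -> str:
    """Classify a page as 'legitimate' or 'phishing' with a three-step cascade."""
    # Step 1: search the page title; accept if the page's domain is in the top 10.
    results = list(search_top10(title))[:10]
    if domain_of(url) in {domain_of(r) for r in results}:
        return "legitimate"
    # Step 2: apply fast heuristic rules on URL characters.
    if looks_like_phishing(url):
        return "phishing"
    # Step 3: fall back to the logistic regression score for the remaining pages.
    return "phishing" if lr_probability(url) >= threshold else "legitimate"
```

In use, `search_top10` would wrap a query to a search engine and `lr_probability` would wrap a trained classifier; for experimentation, both can be replaced with simple stand-in functions.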
Pages: 256-275
Number of pages: 20