A keyword-based combination approach for detecting phishing webpages

Cited by: 28
Authors
Ding, Yan [1]
Luktarhan, Nurbol [1]
Li, Keqin [2]
Slamu, Wushour [1]
Affiliations
[1] Xinjiang Univ, Coll Informat Sci & Engn, Urumqi, Peoples R China
[2] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY USA
Funding
China Postdoctoral Science Foundation;
Keywords
Heuristic rule; Machine learning; Phishing; Search engine; URL obfuscation techniques;
DOI
10.1016/j.cose.2019.03.018
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In this paper, the Search & Heuristic Rule & Logistic Regression (SHLR) combination detection method is proposed for detecting the obfuscation techniques commonly used by phishing websites and for improving the filtering efficiency of legitimate webpages. The method is composed of three steps. First, the content of the webpage's title tag is submitted as search keywords to the Baidu search engine, and the webpage is considered legitimate if its domain matches the domain name of any of the top-10 search results; otherwise, further evaluation is performed. Second, if the webpage cannot be identified as legitimate, it is examined against heuristic rules defined on character features to determine whether it is a phishing page. The first two steps can quickly filter webpages to meet the needs of real-time detection. Finally, a logistic regression classifier assesses the remaining pages to enhance the adaptability and accuracy of the detection method. The experimental results show that the SHLR can filter 61.9% of legitimate webpages and identify 22.9% of phishing webpages based on uniform resource locator (URL) lexical information. The accuracy of the SHLR is 98.9%; thus, its phishing detection performance is high. © 2019 Elsevier Ltd. All rights reserved.
Pages: 256-275
Page count: 20
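
The three-step cascade described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the search engine is replaced by an injected `search` callable (hypothetical), the two heuristic rules (IP-address host, `@` in the authority part) are illustrative stand-ins for the paper's character-feature rules, and the logistic regression features, `WEIGHTS`, and `BIAS` are made-up values that a real system would learn from labeled data. Domains are compared by host name only, which is a simplification.

```python
import math
import re
from typing import Callable, Iterable
from urllib.parse import urlparse

# Hypothetical coefficients; a real classifier would learn these from labeled data.
WEIGHTS = (1.0, 0.5, 0.5)
BIAS = -3.0


def host_of(url: str) -> str:
    """Lowercased host of a URL without a leading 'www.' (simplified domain)."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def search_step(url: str, title: str, search: Callable[[str], Iterable[str]]) -> bool:
    """Step 1: legitimate if the page's host appears among the top-10 result URLs."""
    target = host_of(url)
    top10 = list(search(title))[:10]
    return bool(target) and any(host_of(hit) == target for hit in top10)


def heuristic_step(url: str) -> bool:
    """Step 2: illustrative URL rules (assumed, not the paper's rule set)."""
    is_ip = re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host_of(url)) is not None
    return is_ip or "@" in urlparse(url).netloc


def phishing_probability(url: str) -> float:
    """Step 3: logistic regression score over three assumed lexical features."""
    host = host_of(url)
    features = (len(url) / 100.0, host.count("."), host.count("-"))
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-z))


def classify(url: str, title: str, search: Callable[[str], Iterable[str]],
             threshold: float = 0.5) -> str:
    """Run the cascade: search filter, then heuristic rules, then the classifier."""
    if search_step(url, title, search):
        return "legitimate"
    if heuristic_step(url):
        return "phishing"
    return "phishing" if phishing_probability(url) >= threshold else "legitimate"
```

Ordering the cheap checks first means the classifier only runs on pages that neither the search filter nor the rules can decide, which is the property the abstract credits for real-time suitability.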