ENiD: An Encrypted Web Pages Traffic Identification Based on Web Visiting Behavior

Cited by: 0
Authors
Ge, Mengmeng [1]
Yu, Xiangzhan [2]
Sachidananda, Vinay Mysore [3]
Liu, Shangqing [3]
Liu, Likun [2]
Affiliations
[1] Nanyang Technol Univ, Harbin Inst Technol, Sch Cyberspace Sci, Harbin, Peoples R China
[2] Harbin Inst Technol, Sch Cyberspace Sci, Harbin, Peoples R China
[3] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
Funding
Natural Science Foundation of Heilongjiang Province; National Natural Science Foundation of China
Keywords
web pages; traffic identification; encrypted traffic; traffic blocks; machine learning; CLASSIFICATION; NETWORK;
DOI
10.1109/ICDMW58026.2022.00082
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the development of network encryption technologies, more websites use encrypted web pages to protect users' data security. However, attackers also use encrypted web pages to hide real web content, such as phishing pages, malware, and bots. Detecting such vulnerable web pages and malicious phishing websites can be accomplished by identifying encrypted web pages. In recent years, using traffic features and machine learning to identify encrypted web pages has been one of the most important research directions in cyber security. In this paper, we propose ENiD, an encrypted web page traffic identification approach. This method uses upload-only blocks and accumulation response size to describe the web page visiting process. Based on a large number of encrypted web page traffic case studies, we evaluated the contributions of different features and selected those that contributed the most. We first capture and publish an encrypted web page traffic dataset, which contains 8,480 web page traffic traces. We evaluate the method's effectiveness with four machine learning algorithms; the results show that our approach achieves an accuracy and an F1 score of 0.97 on 50 web pages. Moreover, we evaluate the effectiveness of ENiD on different numbers of web pages, and the results demonstrate that the method remains effective on more than 400 web pages.
Pages: 593-601
Number of pages: 9