Web Proxy Log Classification for Burst Behavior

Cited by: 0
Authors
Kiatkumjounwong, Nattapol [1]
Ngamsuriyaroj, Sudsanguan [1]
Plangprasopchok, Anon [2]
Affiliations
[1] Mahidol Univ, Fac Informat & Commun Technol, Nakhon Pathom, Thailand
[2] Natl Elect & Comp Technol Ctr, Thailand Sci Pk, Pathum Thani, Thailand
Keywords
Web proxy logs; Log classification; Outlier detection; File categories; File types
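DOI
Not available
Chinese Library Classification (CLC) codes
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809
Abstract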
Many organizations and most Internet service providers need to keep a history of web accesses in the form of proxy logs. Such logs can later be used to analyze web usage and to investigate abnormal activities, including abuse, misuse, or fraud. This paper classifies web proxy logs into three classes: normal, non-burst, and burst. To filter out normal logs, we use the Apriori algorithm in the Weka mining tool to detect outliers based on the duration and bandwidth of logs for file categories. Burst logs are then separated from the outlier logs using threshold rates computed for file types. The experimental results show that normal logs form the majority, at about 80%, while burst logs account for about 2% and should be further investigated for unusual behavior. Because the number of stored logs can be very large, processing them in a timely manner is difficult. Thus, we measure the performance of parallel log processing on a Hadoop system with varying data sizes and numbers of nodes. We found that the speedup of log processing corresponds well to the increasing workload, which suggests that processing logs in real time is feasible.
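The three-way split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper detects outliers with the Apriori algorithm in Weka, whereas the sketch below substitutes a simple per-category limit check so that it stays self-contained. The record fields, the limit table, and the rate thresholds are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProxyLog:
    file_category: str  # hypothetical field, e.g. "video" or "image"
    file_type: str      # hypothetical field, e.g. "mp4" or "jpg"
    duration_s: float   # transfer duration in seconds
    size_bytes: int     # bytes transferred


# Assumed limits per file category: (max duration in seconds, max bytes).
# The paper derives its outlier boundary from the data; these are made up.
CATEGORY_LIMITS = {
    "video": (600.0, 50_000_000),
    "image": (10.0, 2_000_000),
}

# Assumed threshold rates per file type, in bytes per second.
TYPE_RATE_THRESHOLDS = {
    "mp4": 1_000_000.0,
    "jpg": 500_000.0,
}


def is_outlier(log: ProxyLog) -> bool:
    """Stand-in outlier test: flag logs exceeding their category limits."""
    max_duration, max_bytes = CATEGORY_LIMITS[log.file_category]
    return log.duration_s > max_duration or log.size_bytes > max_bytes


def classify(log: ProxyLog) -> str:
    """Return "normal", "non-burst", or "burst" for a single log record."""
    if not is_outlier(log):
        return "normal"
    # Guard against zero-duration records when computing the transfer rate.
    rate = log.size_bytes / max(log.duration_s, 1e-9)
    if rate > TYPE_RATE_THRESHOLDS[log.file_type]:
        return "burst"
    return "non-burst"
```

Under these assumed tables, a short and small image transfer is labeled normal, a long but slow one is labeled non-burst, and an oversized transfer at a high rate is labeled burst.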
Pages: 472-477
Number of pages: 6