Detecting and Characterizing Web Bot Traffic in a Large E-commerce Marketplace

Cited by: 7
Authors
Xu, Haitao [1]
Li, Zhao [2]
Chu, Chen [2]
Chen, Yuanmi [2]
Yang, Yifan [2]
Lu, Haifeng [2]
Wang, Haining [3]
Stavrou, Angelos [4]
Affiliations
[1] Arizona State Univ, Glendale, AZ 85306 USA
[2] Alibaba Grp, Hangzhou, Zhejiang, Peoples R China
[3] Univ Delaware, Newark, DE 19716 USA
[4] George Mason Univ, Fairfax, VA 22030 USA
DOI
10.1007/978-3-319-98989-1_8
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
A certain amount of web traffic on the Internet is attributed to web bots. Web bot traffic has raised serious concerns among website operators because it usually consumes considerable resources at web servers, resulting in high workloads and longer response times, while not bringing in any profit. Even worse, the content of the crawled pages might later be used for fraudulent activities. Thus, it is important to detect web bot traffic and characterize it. In this paper, we first propose an efficient approach to detect web bot traffic in a large e-commerce marketplace and then perform an in-depth analysis of the characteristics of web bot traffic. Specifically, our proposed bot detection approach consists of the following modules: (1) an Expectation Maximization (EM)-based feature selection method to extract the most distinguishable features, (2) a gradient-based decision tree to calculate the likelihood of being a bot IP, and (3) a threshold estimation mechanism aiming to recover a reasonable amount of non-bot traffic flow. The detection approach has been applied to the Taobao/Tmall platforms, and its detection capability has been demonstrated by identifying a considerable amount of web bot traffic. Based on data samples of traffic originating from web bots and normal users, we conduct a comparative analysis to uncover the behavioral patterns that distinguish web bots from normal users. The analysis results reveal their differences in terms of active time, search queries, item and store preferences, and many other aspects. These findings provide new insights for public websites to further improve web bot traffic detection for protecting valuable web content.
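The abstract does not say how the threshold in module (3) is estimated. As an illustration only, the sketch below shows one common way to pick a cut-off on per-IP bot-likelihood scores so that a labeled sample of normal-user traffic is mostly left unflagged. The function names `estimate_threshold` and `flag_bot_ips`, the false-positive budget, and the example scores are all assumptions for this sketch, not the authors' method.

```python
from typing import Dict, List, Sequence


def estimate_threshold(normal_scores: Sequence[float],
                       max_false_positive_rate: float) -> float:
    """Pick a score cut-off from a labeled sample of normal-user IPs.

    An IP is flagged when its score is strictly greater than the returned
    threshold. The threshold is chosen so that at most
    max_false_positive_rate of the normal sample would be flagged.
    (Hypothetical helper; the paper's actual mechanism is not specified.)
    """
    if not normal_scores:
        raise ValueError("need at least one normal-user score")
    if not 0.0 <= max_false_positive_rate <= 1.0:
        raise ValueError("max_false_positive_rate must be in [0, 1]")

    ranked = sorted(normal_scores, reverse=True)
    # Number of normal IPs we are willing to misclassify as bots.
    allowed = int(max_false_positive_rate * len(ranked))
    if allowed >= len(ranked):
        # Budget covers the whole sample: every score may be flagged.
        return float("-inf")
    # Only scores strictly above ranked[allowed] get flagged, and at most
    # `allowed` normal scores can be strictly above it.
    return ranked[allowed]


def flag_bot_ips(scores_by_ip: Dict[str, float], threshold: float) -> List[str]:
    """Return the IPs whose bot-likelihood score exceeds the threshold."""
    return sorted(ip for ip, score in scores_by_ip.items() if score > threshold)


if __name__ == "__main__":
    # Made-up scores for illustration; not data from the paper.
    threshold = estimate_threshold([0.1, 0.2, 0.4, 0.9], 0.25)
    print(threshold)
    print(flag_bot_ips({"ip_a": 0.95, "ip_b": 0.4, "ip_c": 0.1}, threshold))
```

In this sketch the scores would come from whatever classifier produces the bot likelihood in module (2). A smaller budget raises the threshold, so fewer normal IPs are flagged and fewer bots are caught.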
Pages: 143-163
Number of pages: 21