Bot recognition in a Web store: An approach based on unsupervised learning

Cited by: 22
Authors
Rovetta, Stefano [1]
Suchacka, Grazyna [2]
Masulli, Francesco [1]
Affiliations
[1] Univ Genoa, Dept Informat Bioengn Robot & Syst Engn, Genoa, Italy
[2] Univ Opole, Inst Informat, Opole, Poland
Keywords
Web bot; Internet robot; Web bot detection; Supervised classification; Unsupervised classification; Machine learning; Web server; ROBOT DETECTION; NEURAL-NETWORK; BEHAVIOR; ATTACKS;
DOI
10.1016/j.jnca.2020.102577
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Web traffic on e-business sites is increasingly dominated by artificial agents (Web bots), which pose a threat to website security, privacy, and performance. Accurately separating the traffic generated by legitimate users from that generated by Web bots is necessary both to develop efficient bot detection methods and to discover reliable e-customer behavioural patterns. This paper proposes a machine learning solution to the problem of classifying sessions as bot or human, with a specific application to e-commerce. The approach uses unsupervised learning (k-means and Graded Possibilistic c-Means) followed by supervised labelling of the clusters, a generative learning strategy that decouples modelling the data from labelling them. Its efficiency is evaluated through experiments on real e-commerce data, in realistic conditions, and compared with that of supervised classifiers (a multi-layer perceptron neural network and a support vector machine). The results show that classification based on unsupervised learning is very efficient, achieving a performance level similar to that of fully supervised classification. This is experimental evidence that the bot recognition problem can be handled successfully with methods that are less sensitive to mislabelled data or missing labels. A very small fraction of sessions remained misclassified in both cases, so an in-depth analysis of the misclassified samples was also performed. This analysis showed the superiority of the proposed approach: it correctly recognized more bots and identified more camouflaged agents that had been erroneously labelled as humans.
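The two-stage idea in the abstract (cluster sessions without labels, then attach labels to whole clusters) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only plain k-means, the two session features (a normalized request rate and the fraction of image requests) are hypothetical, and labelling each cluster by majority vote over the few labelled sessions is an assumption made for illustration.

```python
import random
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(point, centroids[j]))


def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means; labels are never used at this stage."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        new_centroids = []
        for j, members in enumerate(groups):
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def label_clusters(points, labels, centroids, default="human"):
    """Assign each cluster the majority label of its labelled members.

    Sessions whose label is None (missing) are ignored in the vote.
    """
    votes = [Counter() for _ in centroids]
    for p, y in zip(points, labels):
        if y is not None:
            votes[nearest(p, centroids)][y] += 1
    return [v.most_common(1)[0][0] if v else default for v in votes]


def classify(point, centroids, cluster_labels):
    """A new session inherits the label of its nearest cluster."""
    return cluster_labels[nearest(point, centroids)]


# Hypothetical sessions: (normalized request rate, fraction of image requests).
sessions = [
    (0.10, 0.90), (0.20, 0.80), (0.15, 0.85), (0.12, 0.95),  # human-like
    (0.90, 0.10), (0.80, 0.05), (0.95, 0.00), (0.85, 0.15),  # bot-like
]
# Only half of the sessions carry a label; None marks a missing label.
partial_labels = ["human", None, "human", None, "bot", None, None, "bot"]

centroids = kmeans(sessions, k=2)
cluster_labels = label_clusters(sessions, partial_labels, centroids)
print(classify((0.88, 0.08), centroids, cluster_labels))
print(classify((0.18, 0.90), centroids, cluster_labels))
```

Because the clustering step never sees the labels, a missing or wrong label affects only the vote within one cluster, which is one way to read the abstract's remark about reduced sensitivity to mislabelled data.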
Pages: 15