A Sampling-Based Method for Detecting Data Poisoning Attacks in Recommendation Systems

Cited by: 0
Authors
Li, Mohan [1]
Lian, Yuxin [1]
Zhu, Jinpeng [1]
Lin, Jingyi [1]
Wan, Jiawen [1]
Sun, Yanbin [1]
Affiliation
[1] Guangzhou University, Cyberspace Institute of Advanced Technology, Guangzhou 510006, People's Republic of China
Keywords
data poisoning; recommendation systems; ensemble learning; data poisoning detection; framework
DOI
10.3390/math12020247
Chinese Library Classification
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
Recommendation algorithms based on collaborative filtering are vulnerable to data poisoning attacks, in which attackers manipulate system output by injecting a large volume of fake rating data. To address this issue, it is essential to investigate methods for detecting systematically injected poisoning data within the rating matrix. Because attackers often inject a large quantity of poisoning data in a short period to achieve their desired impact, these data may exhibit spatial proximity; that is, poisoning data may be concentrated in adjacent rows of the rating matrix. This paper exploits this proximity and introduces a sampling-based method for detecting data poisoning attacks. First, we design a rating matrix sampling method specifically for detecting poisoning data. By examining the differences between samples drawn from the original rating matrix, it is possible to infer the presence of a poisoning attack and discard the poisoning data. Second, we develop a method for pinpointing malicious data based on the distance between rating vectors; the distance calculations identify the positions of the malicious data in the matrix. Finally, we validate the method on three real-world datasets. The results demonstrate that the method is effective at identifying malicious data within the rating matrix.
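The two steps described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, only a toy under stated assumptions: the rating matrix is dense, the "samples" are contiguous blocks of rows, each block is summarized by its mean rating vector, and Euclidean distance is used throughout. The function name `detect_poisoned_rows` and the parameters `block_size`, `block_ratio`, and `row_k` are hypothetical and do not come from the paper.

```python
import numpy as np

def detect_poisoned_rows(ratings, block_size=10, block_ratio=2.0, row_k=3.0):
    """Toy two-step detector (illustrative assumption, not the paper's method)."""
    ratings = np.asarray(ratings, dtype=float)
    n_rows = ratings.shape[0]

    # Step 1: sample contiguous row blocks and summarize each by its mean vector.
    starts = np.arange(0, n_rows, block_size)
    profiles = np.array([ratings[s:s + block_size].mean(axis=0) for s in starts])

    # The per-item median over blocks serves as a robust reference profile.
    reference = np.median(profiles, axis=0)
    block_dev = np.linalg.norm(profiles - reference, axis=1)

    # A block is suspicious if it deviates far more than a typical block.
    flagged = block_dev > block_ratio * np.median(block_dev)

    # Step 2: pinpoint rows by the distance of each rating vector to the reference.
    row_dist = np.linalg.norm(ratings - reference, axis=1)
    in_flagged = np.repeat(flagged, block_size)[:n_rows]

    # Threshold from rows outside flagged blocks: median + k * scaled MAD.
    clean = row_dist[~in_flagged]
    med = np.median(clean)
    mad = 1.4826 * np.median(np.abs(clean - med))
    suspicious = in_flagged & (row_dist > med + row_k * mad)
    return np.flatnonzero(suspicious)

# Synthetic data: 100 users x 50 items with uniform random ratings in 1..5.
rng = np.random.default_rng(0)
ratings = rng.integers(1, 6, size=(100, 50)).astype(float)

# Inject ten adjacent fake profiles that push item 0 and rate everything else 1.
ratings[40:50] = 1.0
ratings[40:50, 0] = 5.0

detected = detect_poisoned_rows(ratings)
print(detected)
```

Real rating matrices are sparse, so a practical version would need to handle missing ratings, and the thresholds here are arbitrary choices made for the synthetic example.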
Pages: 13