Think Outside the Dataset: Finding Fraudulent Reviews using Cross-Dataset Analysis

Cited by: 13
Authors
Nilizadeh, Shirin [1]
Aghakhani, Hojjat [2]
Gustafson, Eric [2]
Kruegel, Christopher [2]
Vigna, Giovanni [2]
Affiliations
[1] Univ Texas Arlington, Arlington, TX 76019 USA
[2] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA
Keywords
Review Websites; Fraudulent Reviews and Campaigns; Cross-Dataset Analysis; Change-Point Analysis
DOI
10.1145/3308558.3313647
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
While online review services provide a two-way conversation between brands and consumers, malicious actors, including misbehaving businesses, have an equal opportunity to distort the reviews for their own gain. We propose OneReview, a method for locating fraudulent reviews by correlating data from multiple crowd-sourced review sites. Our approach uses Change Point Analysis to locate points at which a business's reputation shifts. Inconsistent trends in reviews of the same business across multiple websites are used to identify suspicious reviews. We then extract an extensive set of textual and contextual features from these suspicious reviews and employ supervised machine learning to detect fraudulent reviews. We evaluated OneReview on about 805K reviews from Yelp and about 462K reviews from TripAdvisor to identify fraud on Yelp. Supervised machine learning yields excellent results, with 97% accuracy. We applied the resulting model to the suspicious reviews and detected about 62K fraudulent reviews (about 8% of all the Yelp reviews). We further analyzed the detected fraudulent reviews and their authors, and located several spam campaigns in the wild, including campaigns against specific businesses as well as campaigns consisting of several hundred socially networked untrustworthy accounts.
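The cross-dataset step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each site's reviews for one business have already been reduced to a series of average ratings per time window, it substitutes a simple CUSUM estimate for whatever Change Point Analysis variant the paper uses, and the names `cusum_change_point`, `inconsistent_shift`, `min_magnitude`, and `tolerance` are hypothetical.

```python
from statistics import mean


def cusum_change_point(ratings):
    """Estimate a single mean shift in a rating series with CUSUM.

    Returns (index, magnitude): the index of the last window before the
    most likely shift, and the range of the cumulative-sum curve.
    """
    avg = mean(ratings)
    cusum, running = [], 0.0
    for r in ratings:
        running += r - avg          # cumulative deviation from the mean
        cusum.append(running)
    idx = max(range(len(cusum)), key=lambda i: abs(cusum[i]))
    magnitude = max(cusum) - min(cusum)
    return idx, magnitude


def inconsistent_shift(site_a, site_b, min_magnitude=3.0, tolerance=2):
    """Return the window index where site A shifts but site B does not.

    min_magnitude and tolerance are illustrative thresholds (assumptions),
    not values taken from the paper.
    """
    idx_a, mag_a = cusum_change_point(site_a)
    idx_b, mag_b = cusum_change_point(site_b)
    if mag_a < min_magnitude:
        return None                 # no notable shift on site A
    if mag_b >= min_magnitude and abs(idx_a - idx_b) <= tolerance:
        return None                 # both sites shift together
    return idx_a                    # shift seen on site A only: suspicious


# Hypothetical per-window average ratings for the same business.
yelp_like = [2, 2, 2, 2, 2, 5, 5, 5, 5, 5]
tripadvisor_like = [2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
suspicious_window = inconsistent_shift(yelp_like, tripadvisor_like)
```

In this sketch, reviews posted around `suspicious_window` would be the candidates passed on to feature extraction and a supervised classifier; a shift that appears on both sites at about the same time is treated as a genuine change in reputation.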
Pages: 3108-3115
Page count: 8