Spotting Fake Reviews via Collective Positive-Unlabeled Learning

Cited by: 90

Authors:

Li, Huayi [1]

Chen, Zhiyuan [1]

Liu, Bing [1]

Wei, Xiaokai [1]

Shao, Jidong [2]

Affiliations:

[1] Univ Illinois, Dept Comp Sci, Chicago, IL 60680 USA

[2] Dianping Inc, Shanghai, Peoples R China

Source:

2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM) | 2014

Keywords:

Spam Detection; Collective PU Learning

DOI:

10.1109/ICDM.2014.47

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Online reviews have become an increasingly important resource for decision making and product design, but review systems are often targeted by opinion spamming. Although researchers have studied fake review detection with supervised learning for years, ground truth for large-scale datasets is still unavailable, and most existing supervised approaches are based on pseudo fake reviews rather than real fake reviews. Working with Dianping, the largest Chinese review hosting site, we present the first reported work on fake review detection in Chinese, using reviews filtered by Dianping's fake review detection system. Dianping's algorithm has very high precision, but its recall is hard to know. This means that the reviews the system detects are almost certainly fake, while the remaining reviews (the unknown set) may not all be genuine. Since the unknown set may contain many fake reviews, it is more appropriate to treat it as an unlabeled set, which calls for learning from positive and unlabeled examples (PU learning). By leveraging the intricate dependencies among reviews, users, and IP addresses, we first propose a collective classification algorithm called Multi-typed Heterogeneous Collective Classification (MHCC) and then extend it to Collective Positive and Unlabeled learning (CPU). Our experiments are conducted on real-life reviews of 500 restaurants in Shanghai, China. Results show that the proposed models markedly improve on the F1 scores of strong baselines in both PU and non-PU learning settings. Since the models use only language-independent features, they can be easily generalized to other languages.
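The setting the abstract describes (a small set of high-precision positives, a large unlabeled set, and links among reviews, users, and IP addresses) can be illustrated with a toy score-propagation sketch. This is not MHCC or CPU, whose details are not given in this record; it is a simple neighbour-averaging heuristic chosen for illustration. The function name `propagate_scores`, the `damping` parameter, and the sample data are all hypothetical.

```python
from collections import defaultdict


def propagate_scores(reviews, known_fake, iterations=10, damping=0.5):
    """Toy collective scoring over a review/user/IP graph (illustrative only).

    reviews:    dict mapping review id -> (user id, IP address)
    known_fake: set of review ids flagged as fake (the positive set)
    Returns a dict mapping review id -> suspicion score in [0, 1].
    """
    # Index reviews by the user and the IP address they are attached to.
    by_entity = defaultdict(list)
    for rid, (user, ip) in reviews.items():
        by_entity[("user", user)].append(rid)
        by_entity[("ip", ip)].append(rid)

    # Positives start at 1.0. Unlabeled reviews start at 0.0, which here
    # means "no evidence yet", not "known genuine".
    score = {rid: 1.0 if rid in known_fake else 0.0 for rid in reviews}

    for _ in range(iterations):
        # Score each user and IP as the mean score of its reviews.
        entity_score = {
            entity: sum(score[r] for r in rids) / len(rids)
            for entity, rids in by_entity.items()
        }
        updated = {}
        for rid, (user, ip) in reviews.items():
            if rid in known_fake:
                updated[rid] = 1.0  # keep labeled positives clamped
                continue
            neighbour = (entity_score[("user", user)] + entity_score[("ip", ip)]) / 2
            # Blend the review's previous score with its neighbourhood.
            updated[rid] = damping * score[rid] + (1 - damping) * neighbour
        score = updated
    return score


# Hypothetical data: r1 is a flagged fake; the rest are unlabeled.
reviews = {
    "r1": ("u1", "ip1"),
    "r2": ("u1", "ip1"),  # same user and same IP as r1
    "r3": ("u2", "ip2"),  # no connection to r1
    "r4": ("u3", "ip1"),  # same IP as r1 only
}
known_fake = {"r1"}
scores = propagate_scores(reviews, known_fake)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this sketch an unlabeled review becomes more suspicious the more it shares a user or IP address with flagged reviews, which is the intuition behind treating the unknown set as unlabeled rather than as genuine.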

Pages: 899-904

Page count: 6
