Spotting Fake Reviews via Collective Positive-Unlabeled Learning

Cited by: 90
Authors
Li, Huayi [1]
Chen, Zhiyuan [1]
Liu, Bing [1]
Wei, Xiaokai [1]
Shao, Jidong [2]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Chicago, IL 60680 USA
[2] Dianping Inc, Shanghai, Peoples R China
Keywords
Spam Detection; Collective PU Learning
DOI
10.1109/ICDM.2014.47
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Online reviews have become an increasingly important resource for decision making and product design. However, review systems are often targeted by opinion spamming. Although fake review detection has been studied for years using supervised learning, ground truth for large-scale datasets is still unavailable, and most existing supervised learning approaches are based on pseudo fake reviews rather than real fake reviews. Working with Dianping, the largest Chinese review hosting site, we present the first reported work on fake review detection in Chinese with filtered reviews from Dianping's fake review detection system. Dianping's algorithm has very high precision, but its recall is hard to know. This means that the reviews detected by the system are almost certainly fake, but the remaining reviews (the unknown set) may not all be genuine. Since the unknown set may contain many fake reviews, it is more appropriate to treat it as an unlabeled set. This calls for the model of learning from positive and unlabeled examples (PU learning). By leveraging the intricate dependencies among reviews, users and IP addresses, we first propose a collective classification algorithm called Multi-typed Heterogeneous Collective Classification (MHCC) and then extend it to Collective Positive and Unlabeled learning (CPU). Our experiments are conducted on real-life reviews of 500 restaurants in Shanghai, China. Results show that our proposed models can markedly improve the F1 scores of strong baselines in both PU and non-PU learning settings. Since our models use only language-independent features, they can be easily generalized to other languages.
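The PU setting described in the abstract can be illustrated with a small sketch. This is not the paper's MHCC or CPU algorithm; it is a generic two-step, centroid-based PU heuristic, shown only to make the idea of "positive plus unlabeled" concrete. The function names, the two numeric features, and the toy data are all hypothetical and do not come from the paper or from Dianping.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def two_step_pu(positives, unlabeled, max_iter=20):
    """Flag unlabeled examples that look like hidden positives.

    Hypothetical heuristic: start by treating every unlabeled example as
    negative, then repeatedly move examples that sit closer to the positive
    centroid than to the current negative centroid into the flagged set.
    """
    pos_center = centroid(positives)
    flagged = [False] * len(unlabeled)
    for _ in range(max_iter):
        negatives = [x for x, f in zip(unlabeled, flagged) if not f]
        if not negatives:
            break  # everything was flagged; no negative centroid to compare
        neg_center = centroid(negatives)
        new_flagged = [
            dist(x, pos_center) < dist(x, neg_center) for x in unlabeled
        ]
        if new_flagged == flagged:
            break  # assignments are stable
        flagged = new_flagged
    return flagged


# Toy, made-up feature vectors (two assumed language-independent scores).
positives = [(0.9, 0.8), (0.8, 0.9), (1.0, 0.7)]       # known fake reviews
unlabeled = [(0.85, 0.85), (0.1, 0.2), (0.2, 0.1),
             (0.15, 0.15), (0.95, 0.75)]                # unknown set

flags = two_step_pu(positives, unlabeled)
print(flags)
```

On this toy data the first and last unlabeled examples lie near the known fakes, so the sketch flags them as likely hidden positives. The point is only that the unknown set is never assumed to be purely genuine.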
Pages: 899-904
Number of pages: 6
Related papers
50 in total
  • [1] Spotting Fake Reviews using Positive-Unlabeled Learning
    Li, Huayi
    Liu, Bing
    Mukherjee, Arjun
    Shao, Jidong
    COMPUTACION Y SISTEMAS, 2014, 18 (03): : 467 - 475
  • [2] GradPU: Positive-Unlabeled Learning via Gradient Penalty and Positive Upweighting
    Dai, Songmin
    Li, Xiaoqiang
    Zhou, Yue
    Ye, Xichen
    Liu, Tong
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 6, 2023, : 7296 - +
  • [3] Density Estimators for Positive-Unlabeled Learning
    Basile, Teresa M. A.
    Di Mauro, Nicola
    Esposito, Floriana
    Ferilli, Stefano
    Vergari, Antonio
    NEW FRONTIERS IN MINING COMPLEX PATTERNS, NFMCP 2017, 2018, 10785 : 49 - 64
  • [4] Generative Adversarial Positive-Unlabeled Learning
    Hou, Ming
    Chaib-draa, Brahim
    Li, Chao
    Zhao, Qibin
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 2255 - 2261
  • [5] Positive-Unlabeled Learning in Streaming Networks
    Chang, Shiyu
    Zhang, Yang
    Tang, Jiliang
    Yin, Dawei
    Chang, Yi
    Hasegawa-Johnson, Mark A.
    Huang, Thomas S.
    KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, : 755 - 764
  • [6] Positive-Unlabeled Learning for Knowledge Distillation
    Ning Jiang
    Jialiang Tang
    Wenxin Yu
    Neural Processing Letters, 2023, 55 : 2613 - 2631
  • [7] Positive-Unlabeled Learning for Knowledge Distillation
    Jiang, Ning
    Tang, Jialiang
    Yu, Wenxin
    NEURAL PROCESSING LETTERS, 2023, 55 (03) : 2613 - 2631
  • [8] A boosting framework for positive-unlabeled learning
    Zhao, Yawen
    Zhang, Mingzhe
    Zhang, Chenhao
    Chen, Weitong
    Ye, Nan
    Xu, Miao
    STATISTICS AND COMPUTING, 2025, 35 (01)
  • [9] Entropy Weight Allocation: Positive-unlabeled Learning via Optimal Transport
    Gu, Wen
    Zhang, Teng
    Jin, Hai
    PROCEEDINGS OF THE 2022 SIAM INTERNATIONAL CONFERENCE ON DATA MINING, SDM, 2022, : 37 - 45
  • [10] Positive-Unlabeled Learning With Label Distribution Alignment
    Jiang, Yangbangyan
    Xu, Qianqian
    Zhao, Yunrui
    Yang, Zhiyong
    Wen, Peisong
    Cao, Xiaochun
    Huang, Qingming
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (12) : 15345 - 15363