Improved Fake Reviews Detection Model Based on Vertical Ensemble Tri-Training and Active Learning

Cited by: 6
Authors
Yin, Chunyong [1]
Cuan, Haoqi [1]
Zhu, Yuhang [1]
Yin, Zhichao [2]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, 219 Ningliu Rd, Nanjing, Jiangsu, Peoples R China
[2] Nanjing Forestry Univ, 159 Longpan Rd, Nanjing, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Fake reviews; tri-training; iterative classifiers; active learning; label accuracy; spam detection; framework
DOI
10.1145/3450285
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
People's increasingly frequent online activity has generated a large number of reviews, but fake reviews can mislead users and harm their personal interests. Labeling reviews on a large scale is not feasible because manual labeling is costly. To improve detection performance by exploiting unlabeled reviews, this article proposes a fake reviews detection model based on vertical ensemble tri-training and active learning (VETT-AL). For feature extraction, the model combines review text features with user behavior features. In the VETT-AL algorithm, the iterative process is divided into two parts: vertical integration within a group and horizontal integration among groups. The intra-group integration combines the three original classifiers with the models those classifiers produced in previous iterations. The inter-group integration uses entropy-based active learning to select the data with the highest confidence and label it; the second-generation classifiers are then trained by the traditional process, which improves label accuracy. Experimental results show that the proposed model achieves good classification performance.
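The entropy-based selection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' VETT-AL implementation: it assumes that "highest confidence" means lowest Shannon entropy of the averaged class probabilities from three classifiers, and the function names (`entropy`, `average_votes`, `select_most_confident`) and the sample probabilities are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def average_votes(per_classifier_probs):
    """Average the class distributions that several classifiers predict for one sample."""
    n = len(per_classifier_probs)
    n_classes = len(per_classifier_probs[0])
    return [sum(p[c] for p in per_classifier_probs) / n for c in range(n_classes)]


def select_most_confident(pool_probs, k):
    """Return indices of the k unlabeled samples with the lowest ensemble entropy.

    Assumption: low entropy is used here as the measure of high confidence.
    """
    scored = sorted((entropy(average_votes(p)), i) for i, p in enumerate(pool_probs))
    return [i for _, i in scored[:k]]


# Hypothetical outputs of three classifiers on four unlabeled reviews,
# each given as [P(genuine), P(fake)].
pool = [
    [[0.9, 0.1], [0.8, 0.2], [0.95, 0.05]],     # classifiers agree: genuine
    [[0.5, 0.5], [0.6, 0.4], [0.4, 0.6]],       # all classifiers unsure
    [[0.1, 0.9], [0.05, 0.95], [0.02, 0.98]],   # classifiers agree: fake
    [[0.7, 0.3], [0.3, 0.7], [0.5, 0.5]],       # classifiers disagree
]

selected = select_most_confident(pool, k=2)
print(selected)
```

In this sketch the two reviews on which the three classifiers agree strongly are selected, while the reviews with uncertain or conflicting predictions are left in the unlabeled pool.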
Pages: 19