Battering Review Spam Through Ensemble Learning in Imbalanced Datasets

Cited by: 0
Authors
Khurshid, Faisal [1 ]
Zhu, Yan [1 ]
Hu, Jie [1 ]
Ahmad, Muqeet [1 ]
Ahmad, Mushtaq [1 ]
Affiliations
[1] Southwest Jiaotong Univ, Sch Informat Sci & Technol, Dept Software Engn, Chengdu Hitech Zone, Xipu Campus,West Pk, Chengdu 611756, Peoples R China
Source
COMPUTER JOURNAL | 2022 / Vol. 65 / No. 07
Keywords
review spam; extreme gradient boosting; bagging; synthetic minority over-sampling technique; resample; DECEPTIVE OPINION SPAM; SYSTEM; MODEL;
DOI
10.1093/comjnl/bxab006
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
Nowadays, people's decisions to buy products or use services are influenced by the reviews and opinions available online. The authenticity of these reviews is doubtful, because many fake reviews, known as review spam, are posted for monetary gain by promoting the poster's own products or services or by demoting a competitor's. In real-life data, spam reviews are relatively fewer than normal reviews, and this class imbalance is a critical concern in review spam detection: performance degrades when the classifier skews towards the majority class. Moreover, efficient feature selection is essential for this problem. The purpose of this study is to develop a framework that combines effective feature selection with data balancing techniques. Validation results show that the proposed framework copes well with the review spam issue and achieves higher precision on the real-life dataset. Further, we tested the sensitivity of the proposed framework using both parametric and non-parametric tests and found it significant.
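The data balancing idea named in the abstract and keywords (synthetic minority over-sampling) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote_like_oversample`, the toy feature vectors, and the choice of `k` are all assumptions made for illustration, and the code only shows the basic interpolation step that SMOTE-style methods rely on.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours.
    Hypothetical helper for illustration; needs at least 2 minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` inside the minority class (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy data (assumed): 8 "normal" reviews vs 3 "spam" reviews, two numeric features each.
normal = [(float(i), float(i % 3)) for i in range(8)]
spam = [(10.0, 10.0), (11.0, 10.5), (10.5, 11.5)]

# Generate enough synthetic spam points to match the majority class size.
new_spam = smote_like_oversample(spam, n_new=len(normal) - len(spam))
balanced_spam = spam + new_spam
```

After this step both classes contain the same number of samples, so a downstream classifier (for example a bagging or gradient boosting ensemble) is less likely to skew towards the majority class. In practice the oversampling should be applied to the training split only.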
Pages: 1666-1678
Number of pages: 13
Related papers
50 in total
  • [1] Empirical Analysis of Ensemble Learning for Imbalanced Credit Scoring Datasets: A Systematic Review
    Lenka, Sudhansu R.
    Bisoy, Sukant Kishoro
    Priyadarshini, Rojalina
    Sain, Mangal
    [J]. WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022
  • [2] Hybrid ensemble and soft computing approaches for review spam detection on different spam datasets
    Amin, Irtiqa
    Dubey, Mithilesh Kumar
    [J]. MATERIALS TODAY-PROCEEDINGS, 2022, 62 : 4779 - 4787
  • [3] A Heterogeneous Ensemble Learning Framework for Spam Detection in Social Networks with Imbalanced Data
    Zhao, Chensu
    Xin, Yang
    Li, Xuefeng
    Yang, Yixian
    Chen, Yuling
    [J]. APPLIED SCIENCES-BASEL, 2020, 10 (03):
  • [4] ASE: Anomaly scoring based ensemble learning for highly imbalanced datasets
    Liang, Xiayu
    Gao, Ying
    Xu, Shanrong
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 238
  • [5] Ensemble learning predicts glass-forming ability under imbalanced datasets
    Cheng, Duan-jie
    Liang, Yong-chao
    Pu, Yuan-wei
    Chen, Qian
    [J]. Computational Materials Science, 2025, 248
  • [6] Sentiment analysis of imbalanced datasets using BERT and ensemble stacking for deep learning
    Habbat, Nassera
    Nouri, Hicham
    Anoun, Houda
    Hassouni, Larbi
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126
  • [7] Ensemble of Rotation Trees for Imbalanced Medical Datasets
    Guo, Huaping
    Liu, Haiyan
    Wu, Chang-an
    Liu, Wei
    She, Wei
    [J]. JOURNAL OF HEALTHCARE ENGINEERING, 2018, 2018
  • [8] New Construction of Ensemble Classifiers for Imbalanced Datasets
    Zhai, Yun
    Ruan, Da
    Ma, Nan
    An, Bing
    [J]. JOURNAL OF MULTIPLE-VALUED LOGIC AND SOFT COMPUTING, 2012, 18 (5-6) : 599 - 616
  • [9] Active Learning for Imbalanced Datasets
    Aggarwal, Umang
    Popescu, Adrian
    Hudelot, Celine
    [J]. 2020 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2020, : 1417 - 1426
  • [10] Enactment of Ensemble Learning for Review Spam Detection on Selected Features
    Khurshid, Faisal
    Zhu, Yan
    Xu, Zhuang
    Ahmad, Mushtaq
    Ahmad, Muqeet
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2019, 12 (01) : 387 - 394