Detection of Chinese Deceptive Reviews Based on Pre-Trained Language Model

Cited by: 5
Authors
Weng, Chia-Hsien [1]
Lin, Kuan-Cheng [1]
Ying, Jia-Ching [1]
Affiliations
[1] Natl Chung Hsing Univ, Dept Management Informat Syst, Taichung 402, Taiwan
Source
APPLIED SCIENCES-BASEL | 2022, Vol. 12, Issue 7
Keywords
natural language processing; detection of deceptive reviews; language model; deep learning; BERT;
DOI
10.3390/app12073338
Chinese Library Classification
O6 [Chemistry];
Subject Classification Code
0703;
Abstract
The advancement of the Internet has changed the ways in which people express and share their views with the world. Moreover, user-generated content has become a primary guide for customers' purchasing decisions. Motivated by commercial interest, some sellers therefore manipulate Internet ratings by writing fake positive reviews to promote their own goods and fake negative reviews to discredit competitors. Such reviews are generally referred to as deceptive reviews. Deceptive reviews mislead customers into purchasing goods that are inconsistent with online information and obstruct fair competition among businesses. To protect the rights of consumers and sellers, an effective method is required to automate the detection of deceptive reviews. Previously developed methods of translating text into feature vectors usually fail to interpret polysemous words, which impairs detection. Using dynamic feature vectors, the present study developed several deceptive-review detection models for the Chinese language and compared them with standard detection models. The models were tested on deceptive reviews collected from various online forums in Taiwan by previous studies. The results showed that the proposed models achieve a precision of 0.92, a recall of 0.91, and an F1-score of 0.91, an improvement of more than 20% over the baselines. Accordingly, we show that the models based on dynamic feature vectors capture semantic terms more accurately than conventional models based on static feature vectors, thereby improving the detection of deceptive reviews.
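As a quick sanity check on the reported metrics, the F1-score is the harmonic mean of precision and recall; a minimal sketch (values taken from the abstract, the function name is our own) confirms the reported figures are internally consistent:

```python
# Sanity-check: the F1-score should be the harmonic mean of
# precision and recall; values are those reported in the abstract.
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

p, r = 0.92, 0.91
print(round(f1_score(p, r), 2))  # 0.91, matching the reported F1-score
```

With precision 0.92 and recall 0.91, the harmonic mean is ≈ 0.915, which rounds to the reported 0.91.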
Pages: 20