Distinguishing between Authentic and Fictitious User-generated Hotel Reviews

Cited by: 0
Authors
Banerjee, Snehasish [1]
Chua, Alton Y. K. [1]
Kim, Jung-Jae [2]
Affiliations
[1] Nanyang Technol Univ, Wee Kim Wee Sch Commun & Informat, Singapore, Singapore
[2] Nanyang Technol Univ, Sch Comp Engn, Singapore, Singapore
Keywords
text analysis; machine learning; data mining; classification algorithms; PREDICTING DECEPTION; RELEVANCE; WORDS; SELF;
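DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;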
Abstract
The objective of this paper is to distinguish between authentic and fictitious user-generated hotel reviews. To achieve this objective, it adopts a two-step approach. The first step seeks to classify authentic and fictitious reviews by leveraging their possible textual differences. The second step attempts to identify the textual traits that are unique to authentic and fictitious reviews. For the purpose of this paper, a ground truth dataset of 1,800 reviews, evenly divided between authentic and fictitious, was created. With respect to the first step, authentic and fictitious reviews were classified using four forms of textual differences: understandability, level of detail, writing style, and cognition indicators. Classification was performed using voting by average probability among logistic regression, C4.5, Support Vector Machine, JRip, and Random Forest classifiers. Using five-fold cross-validation, the proposed approach was found to outperform two existing baselines. Furthermore, with respect to the second step, the textual traits unique to authentic and fictitious reviews were identified using the Information Gain and Chi-squared feature selection techniques. A sequential forward feature selection approach was further adopted to identify the top five features that aid the classification of authentic and fictitious reviews. These include the use of nouns, articles, function words, punctuation, and in particular, exclamation points in reviews. The implications of the results are discussed.
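The "voting by average probability" step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the label strings, and the probability values below are all hypothetical, chosen only to show how averaging class probabilities across several classifiers can differ from a simple majority vote.

```python
from typing import Dict, Sequence

# Class labels taken from the abstract's wording; the strings are an assumption.
LABELS = ("authentic", "fictitious")


def average_probability_vote(
    per_classifier_probs: Sequence[Dict[str, float]],
) -> Dict[str, float]:
    """Average each class probability over all classifiers."""
    if not per_classifier_probs:
        raise ValueError("need at least one classifier output")
    n = len(per_classifier_probs)
    return {
        label: sum(probs[label] for probs in per_classifier_probs) / n
        for label in LABELS
    }


def predict(per_classifier_probs: Sequence[Dict[str, float]]) -> str:
    """Return the label with the highest averaged probability.

    On an exact tie, the first label in LABELS is returned.
    """
    averaged = average_probability_vote(per_classifier_probs)
    return max(LABELS, key=lambda label: averaged[label])


# Hypothetical outputs of five classifiers for one review (made-up numbers,
# not results from the paper).
example = [
    {"authentic": 0.70, "fictitious": 0.30},
    {"authentic": 0.40, "fictitious": 0.60},
    {"authentic": 0.45, "fictitious": 0.55},
    {"authentic": 0.90, "fictitious": 0.10},
    {"authentic": 0.48, "fictitious": 0.52},
]

if __name__ == "__main__":
    print(average_probability_vote(example))
    print(predict(example))
```

In this made-up example, three of the five classifiers lean towards "fictitious", so a majority vote would pick that label, while the averaged probabilities favour "authentic" because two classifiers are far more confident. That sensitivity to confidence is the main design difference between the two voting schemes.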
Pages: 12-18
Page count: 7
Related papers
50 in total
  • [21] Harms of inconsistency: The impact of user-generated and marketing-generated photos on hotel booking intentions
    Zhang, Shan
    Liu, Weifang
    Zhang, Tingting
    Han, Wei
    Zhu, Yupeng
    [J]. TOURISM MANAGEMENT PERSPECTIVES, 2024, 51
  • [22] User-Generated Evidence
    Hamilton, Rebecca J.
    [J]. COLUMBIA JOURNAL OF TRANSNATIONAL LAW, 2019, 57 (01) : 1 - 61
  • [23] User-generated content
    Wofford, Jennifer
    [J]. NEW MEDIA & SOCIETY, 2012, 14 (07) : 1236 - 1239
  • [24] User-generated content
    Greenfield, David
    [J]. CONTROL ENGINEERING, 2009, 56 (10) : 2 - 2
  • [25] Digital Marketing and User-Generated Content: A Case Study of Vidago Palace Hotel
    Clara, Irina
    Paiva, Teresa
    Morais, Elisabete Paulo
    [J]. MARKETING AND SMART TECHNOLOGIES, VOL 1, 2022, 279 : 451 - 461
  • [26] Big data for big insights: Investigating language-specific drivers of hotel satisfaction with 412,784 user-generated reviews
    Liu, Yong
    Teichert, Thorsten
    Rossi, Matti
    Li, Hongxiu
    Hu, Feng
    [J]. TOURISM MANAGEMENT, 2017, 59 : 554 - 563
  • [27] On the Relationship between Novelty and Popularity of User-Generated Content
    Carmel, David
    Roitman, Haggai
    Yom-Tov, Elad
    [J]. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2012, 3 (04)
  • [28] Unveiling user-generated content: Designing websites to best present customer reviews
    Liu, Qianqian
    Karahanna, Elena
    Watson, Richard T.
    [J]. BUSINESS HORIZONS, 2011, 54 (03) : 231 - 240
  • [29] More than the Quantity: The Value of Editorial Reviews for a User-Generated Content Platform
    Deng, Yipu
    Zheng, Jinyang
    Khern-Am-Nuai, Warut
    Kannan, Karthik
    [J]. MANAGEMENT SCIENCE, 2022, 68 (09) : 6865 - 6888
  • [30] Motives for reading and articulating user-generated restaurant reviews on Yelp.com
    Parikh, Anish
    Behnke, Carl
    Vorvoreanu, Mihaela
    Almanza, Barbara
    Nelson, Doug
    [J]. JOURNAL OF HOSPITALITY AND TOURISM TECHNOLOGY, 2014, 5 (02) : 160 - 176