Distinguishing between Authentic and Fictitious User-generated Hotel Reviews

Authors
Banerjee, Snehasish [1]
Chua, Alton Y. K. [1]
Kim, Jung-Jae [2]
Affiliations
[1] Nanyang Technol Univ, Wee Kim Wee Sch Commun & Informat, Singapore, Singapore
[2] Nanyang Technol Univ, Sch Comp Engn, Singapore, Singapore
Keywords
text analysis; machine learning; data mining; classification algorithms; predicting deception; relevance; words; self
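DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812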
Abstract
The objective of this paper is to distinguish between authentic and fictitious user-generated hotel reviews. To achieve this objective, it adopts a two-step approach. The first step seeks to classify authentic and fictitious reviews by leveraging their possible textual differences. The second step attempts to identify the textual traits that are unique to authentic and fictitious reviews. For the purpose of this paper, a ground truth dataset of 1,800 reviews, evenly divided between authentic and fictitious, was created. With respect to the first step, authentic and fictitious reviews were classified using four forms of textual differences: understandability, level of detail, writing style, and cognition indicators. Classification was performed using voting by average probability among logistic regression, C4.5, Support Vector Machine, JRip, and Random Forest classifiers. Using five-fold cross-validation, the proposed approach was found to outperform two existing baselines. With respect to the second step, the textual traits unique to authentic and fictitious reviews were identified using the Information Gain and Chi-squared feature selection techniques. A sequential forward feature selection approach was further adopted to identify the top five features that aid the classification of authentic and fictitious reviews. These include the use of nouns, articles, function words, punctuation, and, in particular, exclamation points in reviews. The implications of the results are discussed.
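Two of the techniques named in the abstract, voting by average probability and Information Gain, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 0.5 decision threshold, the label strings, and the toy inputs are all assumptions made for illustration, and the feature passed to the Information Gain function is assumed to be already discretized (for example, whether a review contains an exclamation point).

```python
from math import log2
from statistics import mean


def vote_by_average_probability(probabilities):
    """Combine classifiers by averaging their P(fictitious) estimates.

    `probabilities` holds one estimate per classifier (hypothetical values).
    The 0.5 threshold is an assumption, not taken from the paper.
    """
    avg = mean(probabilities)
    label = "fictitious" if avg >= 0.5 else "authentic"
    return label, avg


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * log2(c / n) for c in counts.values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature.

    Assumes `feature_values` is discrete (e.g. 0/1 for "has exclamation point").
    """
    n = len(labels)
    groups = {}
    for x, y in zip(feature_values, labels):
        groups.setdefault(x, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


if __name__ == "__main__":
    # Five hypothetical classifier outputs for a single review.
    print(vote_by_average_probability([0.9, 0.6, 0.4, 0.7, 0.8]))
    # A toy feature that separates the two classes perfectly.
    print(information_gain([1, 1, 0, 0], ["f", "f", "a", "a"]))
```

Averaging probabilities lets a confident classifier outweigh several uncertain ones, which a simple majority vote would not allow. Ranking features by Information Gain then indicates which discretized textual traits reduce uncertainty about the class the most.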
Pages: 12-18
Number of pages: 7