Machine Cleaning of Online Opinion Spam: Developing a Machine-Learning Algorithm for Detecting Deceptive Comments

Cited by: 13
Authors
Oh, Yu Won [1]
Park, Chong Hyun [2]
Affiliations
[1] Sejong Univ, Dept Media & Commun, Seoul, South Korea
[2] Sejong Univ, Coll Business Adm, 209 Neungdong Ro, Seoul 05006, South Korea
Keywords
deceptive comments; fake comments; machine learning; opinion spam; situational theory
DOI
10.1177/0002764219878238
Chinese Library Classification number
B849 [Applied Psychology]
Discipline classification code
040203
Abstract
Humans are not very good at detecting deception. The problem is that there is currently no other particular way to distinguish fake opinions in a comments section than by resorting to poor human judgments. For years, most scholarly and industrial efforts have been directed at detecting fake consumer reviews of products or services. A technique for identifying deceptive opinions on social issues is largely underexplored and undeveloped. Inspired by the need for a reliable deceptive comment detection method, this study aims to develop an automated machine-learning technique capable of determining opinion trustworthiness in a comment section. In the process, we have created the first large-scale ground truth dataset consisting of 866 truthful and 869 deceptive comments on social issues. This is also one of the first attempts to detect comment deception in Asian languages (in Korean, specifically). The proposed machine-learning technique achieves nearly 81% accuracy in detecting untruthful opinions about social issues. This performance is quite consistent across issues and well beyond that of human judges.
Pages: 389-403
Page count: 15