Detecting Comment Spam through Content Analysis

Authors
Huang, Congrui [1]
Jiang, Qiancheng [1]
Zhang, Yan [1]
Affiliations
[1] Peking Univ Beijing, Key Lab Machine Percept, Minist Educ, Sch Elect Engn & Comp Sci, Beijing, Peoples R China
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the Web 2.0 era, individual Internet users can also act as information providers, publishing information or posting comments with ease. However, some participants spread irresponsible remarks or post irrelevant comments for commercial gain. This so-called comment spam severely degrades information quality. This paper attempts to detect comment spam automatically through content analysis, using some previously undescribed features. Experiments on a real data set show that our combined heuristics can correctly identify comment spam with high precision (90.4%) and recall (84.5%).
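For readers unfamiliar with the two metrics the abstract reports, the following is a minimal sketch of how precision and recall are usually computed for a binary spam classifier. It is not the authors' evaluation code; the function name `precision_recall` and the six example labels are hypothetical and chosen only for illustration.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for the positive class (True = spam)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # spam correctly flagged
    fp = sum(1 for p, a in pairs if p and not a)    # legitimate comment flagged as spam
    fn = sum(1 for p, a in pairs if not p and a)    # spam that was missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical classifier output and ground truth for six comments.
predicted = [True, True, False, True, False, False]
actual = [True, False, False, True, True, False]

p, r = precision_recall(predicted, actual)
print(f"precision={p:.3f} recall={r:.3f}")
```

Precision is the share of flagged comments that really are spam, and recall is the share of all spam comments that were flagged, so the reported figures mean that about 9 in 10 flagged comments were spam and about 8.5 in 10 spam comments were caught.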
Pages: 222-233
Number of pages: 12