Spammer Classification Using Ensemble Methods over Content-Based Features

Cited by: 8
Authors
Makkar, Aaisha [1 ]
Goel, Shivani [1 ]
Affiliations
[1] Thapar Univ, Comp Sci & Engn Dept, Patiala, Punjab, India
Keywords
Web spamming; Machine learning; Boosting; Ensemble
DOI
10.1007/978-981-10-3325-4_1
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
As the number of web documents grows rapidly, it is becoming very difficult to access useful information. Search engines play a major role in the retrieval of relevant information and knowledge. They manage large amounts of information with efficient page-ranking algorithms. Still, web spammers try to intrude into search engine results through various web spamming techniques for their personal benefit. According to a March 2016 report from Internetlivestats, an Internet survey company, there are currently 3.4 billion Internet users in the world. From this survey it can be judged that search engines play a vital role in information retrieval. In this research, we investigated fifteen different machine learning classification algorithms over content-based features to classify spam and non-spam web pages. An ensemble is built from the three algorithms that performed best on the basis of various parameters. Ten-fold cross-validation is also used.
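The evaluation scheme described in the abstract (rank candidate classifiers, combine the best three into an ensemble, and score with ten-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data is synthetic, the candidate learners are simple one-feature threshold rules standing in for the fifteen algorithms, the combination rule is a plain majority vote, and every name (`make_pages`, `fit_stump`, `fit_top3_vote`, `cross_validate`) is hypothetical.

```python
import random
from collections import Counter


def make_pages(n=200, n_features=5, seed=0):
    """Synthetic stand-in for content-based features; label 1 means spam."""
    rng = random.Random(seed)
    pages = []
    for i in range(n):
        label = i % 2  # balanced classes
        # Earlier features separate the two classes more strongly than later ones.
        feats = [rng.gauss(label * (2.0 - 0.4 * j), 1.0) for j in range(n_features)]
        pages.append((feats, label))
    rng.shuffle(pages)
    return pages


def fit_stump(train, j):
    """Candidate learner: threshold feature j midway between the class means."""
    spam = [f[j] for f, y in train if y == 1]
    ham = [f[j] for f, y in train if y == 0]
    m1, m0 = sum(spam) / len(spam), sum(ham) / len(ham)
    threshold = (m0 + m1) / 2
    sign = 1 if m1 >= m0 else -1
    return lambda feats: int(sign * (feats[j] - threshold) > 0)


def accuracy(model, rows):
    return sum(model(f) == y for f, y in rows) / len(rows)


def fit_top3_vote(train, n_features):
    """Rank candidates on the training fold only, keep three, majority-vote."""
    candidates = [fit_stump(train, j) for j in range(n_features)]
    best = sorted(candidates, key=lambda m: accuracy(m, train), reverse=True)[:3]

    def vote(feats):
        # Three voters and two labels, so a tie cannot occur.
        return Counter(m(feats) for m in best).most_common(1)[0][0]

    return vote


def cross_validate(rows, k=10, n_features=5):
    """k-fold cross-validation: each fold is held out once as the test set."""
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        model = fit_top3_vote(train, n_features)
        scores.append(accuracy(model, folds[i]))
    return scores


scores = cross_validate(make_pages())
mean_accuracy = sum(scores) / len(scores)
print(f"10-fold mean accuracy: {mean_accuracy:.3f}")
```

Selecting the three candidates inside each training fold keeps the held-out fold untouched by model selection. Whether the paper ranks its algorithms this way is not stated in the abstract.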
Pages: 1-9
Page count: 9