An Analysis of Machine Learning Methods for Spam Host Detection

Cited by: 11
Authors
Silva, Renato M. [1]
Yamakami, Akebo [1]
Almeida, Tiago A. [2]
Affiliations
[1] Univ Campinas UNICAMP, Sch Elect & Comp Engn, Sao Paulo, Brazil
[2] Fed Univ Sao Carlos UFSCar, Dept Comp Sci, Sao Paulo, Brazil
Funding
São Paulo Research Foundation (Brazil);
Keywords
spamdexing; web spam; spam host; classification;
DOI
10.1109/ICMLA.2012.161
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The web is becoming an increasingly important source of entertainment, communication, research, news and trade. As a result, web sites compete to attract users' attention, and many of them gain visibility through malicious strategies that try to circumvent search engines. Such sites are known as web spam, and they are generally responsible for personal injury and economic losses. Given this scenario, this paper presents a comprehensive performance evaluation of several established machine learning techniques used to automatically detect and filter hosts that disseminate web spam. Our experiments were carefully designed to ensure statistically sound results, and they indicate that bagging of decision trees, multilayer perceptron neural networks, random forest and adaptive boosting of decision trees are promising for web spam classification and can therefore serve as a good baseline for further comparison.
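As a rough illustration of one technique named in the abstract, the sketch below implements bagging in plain Python. It is not the authors' experimental setup. The data are synthetic two-feature points standing in for host features, the base learner is a depth-1 decision tree (a stump) rather than a full decision tree, and every function name (`make_hosts`, `fit_stump`, `fit_bagging`, and so on) is hypothetical.

```python
import random
from collections import Counter


def make_hosts(n, rng):
    """Synthetic stand-in data (an assumption): two features per host, label 1 = spam."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 1.0 if label else -1.0
        features = [rng.gauss(center, 1.0), rng.gauss(center, 1.0)]
        data.append((features, label))
    return data


def predict_stump(stump, feats):
    """Predict `polarity` when the chosen feature exceeds the threshold, else the other class."""
    f, t, polarity = stump
    return polarity if feats[f] > t else 1 - polarity


def fit_stump(sample):
    """Exhaustively pick the (feature, threshold, polarity) with the fewest training errors."""
    best_errors, best_stump = None, None
    for f in range(len(sample[0][0])):
        for x, _ in sample:
            for polarity in (0, 1):
                stump = (f, x[f], polarity)
                errors = sum(predict_stump(stump, feats) != y for feats, y in sample)
                if best_errors is None or errors < best_errors:
                    best_errors, best_stump = errors, stump
    return best_stump


def fit_bagging(train, n_estimators, rng):
    """Train each stump on a bootstrap sample (drawn with replacement) of the training set."""
    models = []
    for _ in range(n_estimators):
        boot = [rng.choice(train) for _ in train]
        models.append(fit_stump(boot))
    return models


def predict_bagging(models, feats):
    """Combine the stumps by majority vote."""
    votes = Counter(predict_stump(m, feats) for m in models)
    return votes.most_common(1)[0][0]


def accuracy(models, data):
    return sum(predict_bagging(models, x) == y for x, y in data) / len(data)


rng = random.Random(0)
train = make_hosts(200, rng)
test = make_hosts(200, rng)
# An odd number of estimators avoids tied votes in this two-class setting.
ensemble = fit_bagging(train, n_estimators=15, rng=rng)
test_accuracy = accuracy(ensemble, test)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

The held-out accuracy printed here describes only the synthetic data and says nothing about the results reported in the paper.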
Pages: 227 - 232
Number of pages: 6