Email Spam Filtering

Cited by: 15
Authors
Puertas Sanz, Enrique [1]
Gomez Hidalgo, Jose Maria [2]
Cortizo Perez, Jose Carlos [3]
Affiliations
[1] Univ Europea Madrid, Madrid 28670, Spain
[2] Optenet, Madrid 28230, Spain
[3] AINet Solut, Madrid 28943, Spain
DOI
10.1016/S0065-2458(08)00603-7
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In recent years, email Spam has become an increasingly important problem, with a large economic impact on society. In this work, we present the problem of Spam, how it affects us, and how we can fight it. We discuss the legal, economic, and technical measures used to stop these unsolicited emails. Among the technical measures, those based on content analysis have been particularly effective in filtering Spam, so we focus on them and explain in detail how they work. In particular, we explain the structure and process of the different Machine Learning methods used for this task, and how they can be made cost-sensitive through methods such as threshold optimization, instance weighting, or MetaCost. We also discuss how to evaluate Spam filters using basic metrics, TREC metrics, and the receiver operating characteristic convex hull method, which best suits classification problems in which the target conditions are not known, as is the case here. We also describe how actual filters are used in practice. Finally, we present the methods spammers use to attack Spam filters and what we can expect in the coming years in the battle between Spam filters and spammers.
Pages: 45-114
Page count: 70