Compression-based spam filter

Cited by: 6
Authors
Almeida, Tiago A. [1]
Yamakami, Akebo [2]
Affiliations
[1] Fed Univ Sao Carlos UFSCar, Dept Comp Sci, BR-18052780 Sorocaba, SP, Brazil
[2] Univ Campinas UNICAMP, Sch Elect & Comp Engn, BR-13083970 Campinas, SP, Brazil
Funding
São Paulo Research Foundation (FAPESP), Brazil
Keywords
compression-based model; spam filter; text categorization; knowledge-based system; machine learning; CLASSIFICATION
DOI
10.1002/sec.639
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Nowadays, e-mail spam is not a novelty, but it is still an important problem with a high impact on the economy. Spam filtering poses a special problem in text categorization, in which the defining characteristic is that filters face an active adversary, which constantly attempts to evade filtering. In this paper, we present a novel approach to spam filtering based on a compression-based model. We have conducted an empirical experiment on eight public and real non-encoded datasets. The results indicate that the proposed filter is fast to construct, is incrementally updateable, and clearly outperforms established spam classifiers. Copyright (c) 2012 John Wiley & Sons, Ltd.
Pages: 327-335
Page count: 9
Related papers
50 items in total
  • [31] Compression-Based Data Augmentation for CNN Generalization
    Benbarrad, Tajeddine
    Kably, Salaheddine
    Arioua, Mounir
    Alaoui, Nabih
    ADVANCES IN CYBERSECURITY, CYBERCRIMES, AND SMART EMERGING TECHNOLOGIES, 2023, 4 : 235 - 244
  • [32] Compression-Based Selective Sampling for Learning to Rank
    Silva, Rodrigo M.
    Gomes, Guilherme C. M.
    Alvim, Mario S.
    Goncalves, Marcos A.
    CIKM'16: PROCEEDINGS OF THE 2016 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2016, : 247 - 256
  • [33] Adaptive Compression-based Models of Chinese Text
    Teahan, William J.
    Wu, Peiliang
    Liu, Wei
    2014 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), VOLS 1-2, 2014, : 874 - 881
  • [34] Compression-Based Tools for Navigation with an Image Database
    Di Lillo, Antonella
    Daptardar, Ajay
    Thomas, Kevin
    Storer, James A.
    Motta, Giovanni
    ALGORITHMS, 2012, 5 (01) : 1 - 17
  • [35] Compression-based steganalysis of LSB embedded images
    Boncelet, C
    Marvel, L
    Raglin, A
    SECURITY, STEGANOGRAPHY, AND WATERMARKING OF MULTIMEDIA CONTENTS VIII, 2006, 6072
  • [36] An efficient algorithm for compression-based compressed sensing
    Beygi, Sajjad
    Jalali, Shirin
    Maleki, Arian
    Mitra, Urbashi
    INFORMATION AND INFERENCE-A JOURNAL OF THE IMA, 2019, 8 (02) : 343 - 375
  • [37] Statistical Compression-Based Models for Text Classification
    Saikrishna, Vidya
    Dowe, David L.
    Ray, Sid
    2016 FIFTH INTERNATIONAL CONFERENCE ON ECO-FRIENDLY COMPUTING AND COMMUNICATION SYSTEMS (ICECCS), 2016, : 1 - 6
  • [38] Compression-Based Methods of Time Series Forecasting
    Chirikhin, Konstantin
    Ryabko, Boris
    MATHEMATICS, 2021, 9 (03) : 1 - 11
  • [39] Instantaneous compression-based tyre rolling radius
    Vantsevich, V. V.
    Paldan, J.
    Verma, M.
    DYNAMICS OF VEHICLES ON ROADS AND TRACKS, VOL 1, 2018, : 259 - 264
  • [40] A Compression-Based Technique to Classify Metamorphic Malware
    Ekhtoom, Duaa
    Al-Ayyoub, Mahmoud
    Al-Saleh, Mohammed
    Alsmirat, Mohammad
    Hmeidi, Ismail
    2016 IEEE/ACS 13TH INTERNATIONAL CONFERENCE OF COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2016,