Catching the drift: Using feature-free case-based reasoning for spam filtering

Cited by: 0
Authors
Delany, Sarah Jane [1]
Bridge, Derek [2]
Affiliations
[1] Dublin Inst Technol, Dublin, Ireland
[2] Univ Coll Cork, Cork, Ireland
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we compare case-based spam filters, focusing on their resilience to concept drift. In particular, we evaluate how to track concept drift using a case-based spam filter that uses a feature-free distance measure based on text compression. In our experiments, we compare two ways to normalise such a distance measure, finding that the one proposed in [1] performs better. We show that a policy as simple as retaining misclassified examples has a hugely beneficial effect on handling concept drift in spam but, on its own, it results in the case base growing by over 30%. We then compare two different retention policies and two different forgetting policies (one a form of instance selection, the other a form of instance weighting) and find that they perform roughly as well as each other while keeping the case base size constant. Finally, we compare a feature-based textual case-based spam filter with our feature-free approach. In the face of concept drift, the feature-based approach requires the case base to be rebuilt periodically so that we can select a new feature set that better predicts the target concept. We find feature-free approaches to have lower error rates than their feature-based equivalents.
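The abstract describes a case-based filter built on a compression-based distance with a "retain misclassified examples" policy, but gives no formulas. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes zlib as the compressor and uses two normalisations that are common in the compression-distance literature (NCD and CDM); the abstract does not say which two normalisations the paper compares. The function names `csize`, `ncd`, `cdm`, `classify` and `classify_and_retain` are hypothetical, and the forgetting policies mentioned in the abstract are not modelled.

```python
import zlib
from collections import Counter


def csize(data: bytes) -> int:
    """Compressed size in bytes; zlib is an assumed stand-in compressor."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalised compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy, cxy = csize(bx), csize(by), csize(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


def cdm(x: str, y: str) -> float:
    """Compression-based dissimilarity measure: C(xy) / (C(x) + C(y))."""
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    return csize(bx + by) / (csize(bx) + csize(by))


def classify(message, case_base, k=3, distance=cdm):
    """Majority vote over the k cases nearest to `message`.

    `case_base` is a list of (text, label) pairs; no features are extracted.
    """
    nearest = sorted(case_base, key=lambda case: distance(message, case[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def classify_and_retain(message, true_label, case_base, **kwargs):
    """Classify, then add the message to the case base only if it was misclassified."""
    predicted = classify(message, case_base, **kwargs)
    if predicted != true_label:
        case_base.append((message, true_label))
    return predicted
```

Because the distance works on raw text, nothing has to be re-derived when the vocabulary of spam shifts. Retaining only misclassified messages still lets the case base grow, which is why the abstract pairs retention with forgetting.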
Pages: 314 / +
Page count: 4