Catching the drift: Using feature-free case-based reasoning for spam filtering

Cited by: 0
Authors
Delany, Sarah Jane [1]
Bridge, Derek [2]
Affiliations
[1] Dublin Inst Technol, Dublin, Ireland
[2] Univ Coll Cork, Cork, Ireland
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we compare case-based spam filters, focusing on their resilience to concept drift. In particular, we evaluate how to track concept drift using a case-based spam filter that uses a feature-free distance measure based on text compression. In our experiments, we compare two ways to normalise such a distance measure, finding that the one proposed in [1] performs better. We show that a policy as simple as retaining misclassified examples has a hugely beneficial effect on handling concept drift in spam but, on its own, it results in the case base growing by over 30%. We then compare two different retention policies and two different forgetting policies (one a form of instance selection, the other a form of instance weighting) and find that they perform roughly as well as each other while keeping the case base size constant. Finally, we compare a feature-based textual case-based spam filter with our feature-free approach. In the face of concept drift, the feature-based approach requires the case base to be rebuilt periodically so that we can select a new feature set that better predicts the target concept. We find feature-free approaches to have lower error rates than their feature-based equivalents.
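The feature-free idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the compressor or the two normalisations, so the sketch uses `zlib` and two normalisations known from the compression-distance literature (the normalised compression distance and the compression-based dissimilarity measure). The function names, the majority-vote k-NN classifier, and the oldest-first forgetting rule are hypothetical stand-ins, not the authors' policies.

```python
import zlib
from typing import Callable, List, Optional, Tuple

Case = Tuple[str, str]  # (message text, label); label is "spam" or "ham"


def csize(text: str) -> int:
    """Compressed size in bytes, used as a rough measure of information."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(x: str, y: str) -> float:
    """Normalised compression distance (assumed normalisation no. 1)."""
    cx, cy = csize(x), csize(y)
    return (csize(x + y) - min(cx, cy)) / max(cx, cy)


def cdm(x: str, y: str) -> float:
    """Compression-based dissimilarity measure (assumed normalisation no. 2)."""
    return csize(x + y) / (csize(x) + csize(y))


def classify(message: str, case_base: List[Case],
             distance: Callable[[str, str], float] = cdm, k: int = 3) -> str:
    """Majority vote over the k nearest cases; no feature extraction."""
    nearest = sorted(case_base, key=lambda case: distance(message, case[0]))[:k]
    spam_votes = sum(1 for _, label in nearest if label == "spam")
    return "spam" if 2 * spam_votes > len(nearest) else "ham"


def run_filter(stream: List[Case], case_base: List[Case],
               distance: Callable[[str, str], float] = cdm, k: int = 3,
               max_size: Optional[int] = None) -> int:
    """Classify each message in order and retain the misclassified ones.

    If max_size is given, the oldest case is dropped after each retention
    (a placeholder forgetting rule, assumed for this sketch only).
    Returns the number of misclassifications.
    """
    errors = 0
    for text, label in stream:
        if classify(text, case_base, distance, k) != label:
            errors += 1
            case_base.append((text, label))  # retain the misclassified example
            if max_size is not None and len(case_base) > max_size:
                case_base.pop(0)  # forget the oldest case
    return errors
```

Calling `run_filter(stream, case_base)` with `max_size=None` lets the case base grow by one case per error, which mirrors the growth the abstract attributes to retention alone; passing `max_size` caps the size at the cost of discarding older cases.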
Pages: 314 / +
Page count: 4