Automatic junk e-mail filtering based on latent content

Cited by: 6
Authors
Bellegarda, JR [1]
Naik, D [1]
Silverman, KEA [1]
Affiliation
[1] Apple Computer, Inc., Spoken Language Group, Cupertino, CA 95014 USA
DOI
10.1109/ASRU.2003.1318485
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The explosion in unsolicited mass electronic mail (junk e-mail) over the past decade has sparked interest in automatic filtering solutions. Traditional techniques tend to rely on header analysis, keyword/keyphrase matching and analogous rule-based predicates, and/or some probabilistic model of text generation. This paper aims instead at deciding whether or not the latent subject matter is consistent with the user's interests. The underlying framework is latent semantic analysis: each e-mail is automatically classified against two semantic anchors, one for legitimate and one for junk messages. Experiments show that this approach is competitive with the state of the art in e-mail classification, and potentially advantageous in real-world applications with high junk-to-legitimate ratios. The resulting technology was successfully released in August 2002 as part of the e-mail client bundled with the MacOS 10.2 operating system.
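
The idea of classifying a message against two semantic anchors can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how the anchors are built, so the sketch assumes raw word counts, a truncated SVD computed with NumPy, and anchors taken as the centroids of the legitimate and junk training messages in the latent space. The names `train`, `classify`, `tokenize`, and `cosine`, the `rank` parameter, and the sample messages are all hypothetical.

```python
import numpy as np

def tokenize(text):
    # Naive whitespace tokenizer (an assumption; real filters do more).
    return text.lower().split()

def train(legit_docs, junk_docs, rank=2):
    """Build a latent space and one anchor per class (assumed: centroids)."""
    docs = list(legit_docs) + list(junk_docs)
    vocab = sorted({w for d in docs for w in tokenize(d)})
    index = {w: i for i, w in enumerate(vocab)}

    # Word-by-document count matrix W (rows: words, columns: messages).
    W = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in tokenize(d):
            W[index[w], j] += 1.0

    # Truncated SVD: keep the top `rank` left singular vectors.
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    U_k = U[:, :rank]

    # Project every training message into the latent space.
    doc_vecs = W.T @ U_k
    n = len(legit_docs)
    anchors = {
        "legit": doc_vecs[:n].mean(axis=0),
        "junk": doc_vecs[n:].mean(axis=0),
    }
    return index, U_k, anchors

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def classify(text, model):
    """Fold a new message into the latent space and pick the closer anchor."""
    index, U_k, anchors = model
    counts = np.zeros(U_k.shape[0])
    for w in tokenize(text):
        if w in index:  # words unseen in training are ignored
            counts[index[w]] += 1.0
    v = counts @ U_k
    # Ties (including messages with no known words) default to "legit",
    # an assumption chosen here to avoid discarding wanted mail.
    is_junk = cosine(v, anchors["junk"]) > cosine(v, anchors["legit"])
    return "junk" if is_junk else "legit"

# Hypothetical toy data, for illustration only.
legit = ["meeting agenda for project review",
         "lunch with the project team tomorrow",
         "please review the attached report"]
junk = ["win free money now",
        "free prize click now",
        "cheap money offer click here"]
model = train(legit, junk, rank=2)
print(classify("free money prize", model))
print(classify("project report review", model))
```

With only two anchors, classification reduces to two cosine comparisons per message, regardless of how many training messages were used to build the latent space.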
Pages: 465-470
Page count: 6