Automatic junk e-mail filtering based on latent content

Cited by: 6
Authors
Bellegarda, JR [1]
Naik, D [1]
Silverman, KEA [1]
Affiliation
[1] Apple Comp Inc, Spoken Language Grp, Cupertino, CA 95014 USA
DOI
10.1109/ASRU.2003.1318485
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The explosion in unsolicited mass electronic mail (junk e-mail) over the past decade has sparked interest in automatic filtering solutions. Traditional techniques tend to rely on header analysis, keyword/keyphrase matching and analogous rule-based predicates, and/or some probabilistic model of text generation. This paper aims instead at deciding whether or not the latent subject matter of a message is consistent with the user's interests. The underlying framework is latent semantic analysis: each e-mail is automatically classified against two semantic anchors, one for legitimate and one for junk messages. Experiments show that this approach is competitive with the state of the art in e-mail classification, and potentially advantageous in real-world applications with high junk-to-legitimate ratios. The resulting technology was successfully released in August 2002 as part of the e-mail client bundled with the MacOS 10.2 operating system.
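The two-anchor idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpora, the whitespace tokenizer, the raw word counts (no weighting), the cosine closeness measure, and every name below (`AnchorClassifier`, `build_anchor_matrix`, `LEGIT`, `JUNK`) are assumptions made for illustration, and NumPy is assumed to be available. The sketch builds a word-by-anchor count matrix with one column per category, takes its SVD, folds a new message into the resulting latent space, and assigns it to the closer anchor.

```python
import numpy as np

# Toy training corpora (hypothetical; not data from the paper).
LEGIT = ["meeting agenda for tomorrow",
         "project report draft attached",
         "lunch meeting with the team"]
JUNK = ["cheap pills special offer",
        "win money now special offer",
        "cheap loans win big"]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption for this sketch)."""
    return text.lower().split()


def build_anchor_matrix(legit_docs, junk_docs):
    """Return (vocab, W), where W[i, j] counts word i in anchor j.

    Column 0 aggregates the legitimate messages, column 1 the junk ones.
    """
    vocab = {}
    for doc in legit_docs + junk_docs:
        for word in tokenize(doc):
            vocab.setdefault(word, len(vocab))
    W = np.zeros((len(vocab), 2))
    for j, docs in enumerate((legit_docs, junk_docs)):
        for doc in docs:
            for word in tokenize(doc):
                W[vocab[word], j] += 1.0
    return vocab, W


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


class AnchorClassifier:
    def __init__(self, legit_docs, junk_docs):
        self.vocab, W = build_anchor_matrix(legit_docs, junk_docs)
        # Thin SVD: W = U @ diag(s) @ Vt, with U of shape (words, 2).
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        self.U = U
        # Row j of V @ diag(s) is anchor j expressed in the latent space.
        self.anchors = Vt.T * s

    def scores(self, text):
        # Count known words; words outside the vocabulary are ignored.
        d = np.zeros(len(self.vocab))
        for word in tokenize(text):
            if word in self.vocab:
                d[self.vocab[word]] += 1.0
        z = d @ self.U  # fold the message into the latent space
        return {"legitimate": cosine(z, self.anchors[0]),
                "junk": cosine(z, self.anchors[1])}

    def classify(self, text):
        sc = self.scores(text)
        # Ties (including messages with no known words) go to "legitimate".
        return "junk" if sc["junk"] > sc["legitimate"] else "legitimate"


if __name__ == "__main__":
    clf = AnchorClassifier(LEGIT, JUNK)
    for msg in ("cheap pills offer", "meeting agenda tomorrow"):
        print(msg, "->", clf.classify(msg), clf.scores(msg))
```

Resolving ties in favor of "legitimate" is a design choice of this sketch, on the reasoning that discarding a wanted message usually costs more than letting a junk one through. A practical filter would also need a weighting scheme and a tuned decision threshold, neither of which the abstract specifies.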
Pages: 465-470
Page count: 6
Related papers
50 in total
  • [31] An efficient method for filtering image-based spam e-mail
    Nhung, Ngo Phuong
    Phuong, Tu Minh
    COMPUTER ANALYSIS OF IMAGES AND PATTERNS, PROCEEDINGS, 2007, 4673 : 945 - 953
  • [32] Learning to filter junk e-mail from positive and unlabeled examples
    Schneider, KM
    NATURAL LANGUAGE PROCESSING - IJCNLP 2004, 2005, 3248 : 426 - 435
  • [33] Content-based E-mail auditing system implementation
    Cao, Jiuxin
    Zhang, Deyun
    Wu, Zhan
    Liu, Weina
    Hsi-An Chiao Tung Ta Hsueh/Journal of Xi'an Jiaotong University, 2002, 36 (06) : 608 - 611
  • [34] Mitigating E-mail threats - A web content based application
    Dhanalakshmi, R.
    Chellappan, C.
    Lecture Notes in Engineering and Computer Science, 2012, 2195 : 632 - 637
  • [35] A Junk Mail Filtering Method Based on LSA and FSVM
    Sun, Jing-tao
    Zhang, Qiu-yu
    Yuan, Zhan-ting
    Huang, Wen-han
    Yan, Xiao-wen
    Dong, Xan-she
    FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 3, PROCEEDINGS, 2008, : 111 - +
  • [36] An e-mail filtering approach using neural network
    Cao, YK
    Liao, XF
    Li, YF
    ADVANCES IN NEURAL NETWORKS - ISNN 2004, PT 2, 2004, 3174 : 688 - 694
  • [37] Collaborative spam filtering using e-mail networks
    Kong, Joseph S.
    Rezaei, Behnam A.
    Sarshar, Nima
    Roychowdhury, Vwani P.
    Boykin, P. Oscar
    COMPUTER, 2006, 39 (08) : 67 - +
  • [38] The origin, content, and workload of e-mail consultations
    Borowitz, SM
    Wyatt, JC
    JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 1998, 280 (15) : 1321 - 1324
  • [39] A Multiobjective Evolutionary Algorithm for Spam E-mail Filtering
    Lopez-Herrera, A. G.
    Herrera-Viedma, E.
    Herrera, F.
    2008 3RD INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEM AND KNOWLEDGE ENGINEERING, VOLS 1 AND 2, 2008, : 366 - +
  • [40] Spam collaborative filtering in Enron e-mail network
    Yang, Zhen
    Lai, Ying-Xu
    Duan, Li-Juan
    Li, Yu-Jian
    Xu, Xin
    Zidonghua Xuebao/Acta Automatica Sinica, 2012, 38 (03) : 399 - 411