Active Learning based Spam Filtering Method

Cited by: 0

Authors:

Zhang, Wei [1]

Gao, Feng [1]

Lv, Di [1]

Xue, Feng [1]

Affiliation:

[1] Xi An Jiao Tong Univ, MOE KLINNS Lab, Xian 710049, Shaanxi Province, Peoples R China

Source:

2010 8TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA) | 2010

Keywords:

Active learning; Spam filtering; Text categorization

DOI:

10.1109/WCICA.2010.5553918

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

The spread of spam seriously threatens Internet security, and content-based filtering has become one of the effective spam-filtering methods. To address the practical problems involved, we propose an active-learning-based method that uses naive Bayesian classifiers as its basic classifiers. The method randomly initializes a small training set to generate the basic classifiers and then uses them to classify mails, each time adding the most uncertain mail to the training set to improve classifier performance. Simulations on the CCERT mail set show that this method not only reduces the number of mails that must be labeled but also improves classifier accuracy.
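
The procedure summarized in the abstract (random small initial training set, naive Bayes classification, repeated labeling of the most uncertain mail) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `train_nb`, `spam_probability`, and `active_learn`, the Laplace smoothing, the use of a single classifier rather than several basic classifiers, and the uncertainty measure (distance of P(spam) from 0.5) are all assumptions made for illustration. Mails are assumed to be lists of tokens, and `oracle` is a hypothetical stand-in for a human labeler.

```python
import math
import random
from collections import Counter


def train_nb(docs, labels):
    """Fit a multinomial naive Bayes model (label 1 = spam, 0 = ham)."""
    vocab = {w for d in docs for w in d}
    counts = {0: Counter(), 1: Counter()}
    for doc, y in zip(docs, labels):
        counts[y].update(doc)
    return vocab, counts, Counter(labels)


def spam_probability(model, doc):
    """Return P(spam | doc) using Laplace-smoothed estimates."""
    vocab, counts, n_docs = model
    total_docs = sum(n_docs.values())
    log_post = {}
    for y in (0, 1):
        # Smoothed prior, so a class with no labeled mail yet is still usable.
        lp = math.log((n_docs[y] + 1) / (total_docs + 2))
        # One extra slot in the denominator covers words outside the vocabulary.
        denom = sum(counts[y].values()) + len(vocab) + 1
        for w in doc:
            lp += math.log((counts[y][w] + 1) / denom)
        log_post[y] = lp
    # Normalize in a numerically stable way.
    m = max(log_post.values())
    e0 = math.exp(log_post[0] - m)
    e1 = math.exp(log_post[1] - m)
    return e1 / (e0 + e1)


def active_learn(pool, oracle, n_init=2, n_queries=10, seed=0):
    """Pool-based uncertainty sampling (assumed variant, one query per round)."""
    rng = random.Random(seed)
    unlabeled = list(range(len(pool)))
    rng.shuffle(unlabeled)
    # Randomly initialize a small labeled training set.
    labeled = [unlabeled.pop() for _ in range(min(n_init, len(unlabeled)))]
    labels = {i: oracle(i) for i in labeled}
    queried = []
    for _ in range(n_queries):
        if not unlabeled:
            break
        model = train_nb([pool[i] for i in labeled],
                         [labels[i] for i in labeled])
        # The most uncertain mail is the one whose P(spam) is closest to 0.5.
        pick = min(unlabeled,
                   key=lambda i: abs(spam_probability(model, pool[i]) - 0.5))
        unlabeled.remove(pick)
        labels[pick] = oracle(pick)  # ask the labeler for this mail only
        labeled.append(pick)
        queried.append(pick)
    final_model = train_nb([pool[i] for i in labeled],
                           [labels[i] for i in labeled])
    return final_model, queried
```

Calling `active_learn(pool, oracle)` returns the final model together with the indices of the mails that were sent for labeling, so the labeling cost can be compared against labeling the whole pool. A committee of several classifiers, as the plural "basic classifiers" in the abstract may suggest, would need a different disagreement measure than the one assumed here.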


Pages: 3302-3306

Page count: 5
