Spam Filtering Based on Improved CHI Feature Selection Method

Cited by: 0
Authors
Lu, Zhimao [1]
Yu, Hongxia [1]
Fan, Dongmei [1]
Yuan, Chaoyue [1]
Affiliations
[1] Harbin Engn Univ, Pattern Recognit & Nat Computat Lab, Harbin 150001, Peoples R China
Keywords
Spam Filtering; Feature Selection; SVM; CHI Square; Modified CHI
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper studies feature selection methods used in spam filtering, including CHI square (CHI), Expected Cross Entropy (ECE), the Weight of Evidence for Text (WET), and Information Gain (IG), and proposes a novel modified CHI feature selection method for spam filtering. A spam filter based on a Support Vector Machine (SVM) is used to evaluate CHI square, Expected Cross Entropy, the Weight of Evidence for Text, Information Gain, and the modified CHI. The experiments show that the modified CHI improves the precision, recall, and F-measure of the spam filter, indicating that the modified CHI feature selection method is effective.
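For orientation, the baseline CHI square score that the abstract compares against can be sketched as follows. This is a minimal illustration of the standard chi-square statistic for a term and a class, computed from a 2x2 document-count table; it is not the authors' modified CHI, whose formula this record does not give. The function names `chi_square` and `select_features` and the count layout are assumptions made for the example.

```python
def chi_square(a, b, c, d):
    """Standard chi-square score of a term for the spam class.

    Hypothetical count layout (number of messages):
      a: spam messages containing the term
      b: non-spam messages containing the term
      c: spam messages without the term
      d: non-spam messages without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A marginal total is zero, so the statistic is undefined; score it 0.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def select_features(counts, k):
    """Return the k terms with the highest chi-square score.

    counts maps each term to its (a, b, c, d) tuple.
    """
    ranked = sorted(counts, key=lambda t: chi_square(*counts[t]), reverse=True)
    return ranked[:k]
```

A term that appears only in spam scores high, while a term spread evenly over both classes scores zero, so ranking by this score keeps the terms most associated with one class. The selected terms would then serve as the feature set for a classifier such as the SVM mentioned in the abstract.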
Pages: 771-773
Page count: 3