Filtering Spam by Using Factors Hyperbolic Trees

Cited by: 0

Authors:

Hou, Hailong [1]

Chen, Yan [1]

Beyah, Raheem [1]

Zhang, Yan-Qing [1]

Affiliation:

[1] Georgia State Univ, Dept Comp Sci, Atlanta, GA 30302 USA

Source:

GLOBECOM 2008 - 2008 IEEE GLOBAL TELECOMMUNICATIONS CONFERENCE | 2008

Keywords:

spam; Bayesian algorithm; Ranked Term Frequency; fuzzy logic; factors hyperbolic trees;

DOI:

10.1109/GLOCOM.2008.ECP.362

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronics and Communication Technology]

Discipline classification codes:

0808; 0809

Abstract:

Most current anti-spam techniques, such as the Bayesian anti-spam algorithm, rely primarily on lexical matching to filter unsolicited bulk e-mail (UBE) and unsolicited commercial e-mail (UCE). However, the precision of spam filtering is usually low when lexical matching algorithms are used in real, dynamic environments. For example, an e-mail advertising refrigerators is useful to most families but useless to Eskimos. Lexical matching algorithms cannot distinguish e-mails that are junk to most people but useful to others. We propose a Factors Hyperbolic Tree (FHT) based algorithm that, unlike lexical matching algorithms, handles spam filtering in a dynamic environment by considering various relevant factors. A new Ranked Term Frequency (RTF) algorithm is proposed to extract indicators related to environmental factors from e-mails. Type-1 and Type-2 fuzzy logic systems evaluate the indicators and determine, based on the environmental factors, whether an e-mail is spam. Additionally, the weights of factors in an FHT database are continuously updated according to dynamic conditional factors in the real environment. Simulation results show that the FHT algorithm filters out spam with high precision. Furthermore, the FHT algorithm is more efficient than other methods when filtering e-mails with complex influencing factors. The main contribution of this paper is that the FHT based algorithm filters e-mails by influencing factors rather than matched words, which allows dynamic filtering of spam.


Pages: 5

