Intelligent SMS Spam Filtering Using Topic Model

Cited by: 14

Authors:

Ma, Jialin ^{[1,2]}

Zhang, Yongjun ^{[1,2]}

Liu, Jinling ^{[1]}

Yu, Kun ^{[1]}

Wang, XuAn ^{[3]}

Affiliations:

[1] Huaiyin Inst Technol, Huaian, Peoples R China

[2] Hohai Univ, Nanjing, Jiangsu, Peoples R China

[3] CAPF, Engn Univ, Xian, Peoples R China

Source:

2016 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT NETWORKING AND COLLABORATIVE SYSTEMS (INCOS) | 2016

Keywords:

SMS Spam; Topic Model; LDA; MTM;

DOI:

10.1109/INCoS.2016.47

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Spam messages are now widespread in many countries. They seriously violate personal rights and may even harm national security. Existing filtering techniques usually use traditional text classifiers, which are better suited to normal long texts; they therefore face serious challenges with SMS messages, such as data sparsity and noisy data. This work proposes a message topic model (MTM) for SMS spam filtering. The MTM is derived from the probabilistic topic model, but it differs from the standard Latent Dirichlet Allocation (LDA) in the following respects: (1) To overcome the sparsity problem in SMS message classification, the standard K-means algorithm is first used to group the training data into rough classes, and all the spam messages of a class are then aggregated into a single document. (2) Symbol semantics is taken into account. Some preprocessing rules and background terms are considered so that the model represents SMS spam more fully. Finally, the MTM is compared with the SVM and the standard LDA on the public SMS spam corpus. The experimental results show that the MTM is more effective for the task of SMS spam filtering.
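
The aggregation step in point (1) of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the term-count vectors, the squared Euclidean distance, the first-k centroid seeding, and the function names (`tokenize`, `kmeans`, `aggregate_by_cluster`) are all assumptions made for illustration, and the topic-model inference that would run on the aggregated documents is not shown.

```python
from collections import Counter


def tokenize(message):
    """Lowercase whitespace tokenizer (an assumption; the paper's
    preprocessing rules are not described in the abstract)."""
    return message.lower().split()


def vectorize(tokens, vocab):
    """Term-count vector over a fixed, ordered vocabulary."""
    counts = Counter(tokens)
    return [float(counts[word]) for word in vocab]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's k-means. The first k vectors seed the centroids,
    which keeps the sketch deterministic (an illustrative choice)."""
    if k < 1 or k > len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def aggregate_by_cluster(messages, k):
    """Cluster short messages, then merge each cluster's tokens into one
    longer pseudo-document to reduce sparsity."""
    token_lists = [tokenize(m) for m in messages]
    vocab = sorted({word for tokens in token_lists for word in tokens})
    vectors = [vectorize(tokens, vocab) for tokens in token_lists]
    labels = kmeans(vectors, k)
    pseudo_docs = {}
    for tokens, label in zip(token_lists, labels):
        pseudo_docs.setdefault(label, []).extend(tokens)
    return labels, pseudo_docs


if __name__ == "__main__":
    # Hypothetical spam messages, invented for this example.
    spam = [
        "win free prize now",
        "free prize call now",
        "cheap loan offer",
        "loan offer today",
    ]
    labels, docs = aggregate_by_cluster(spam, k=2)
    for label, tokens in sorted(docs.items()):
        print(label, " ".join(tokens))
```

Each pseudo-document pools the word occurrences of several short messages, so a topic model fitted to these documents sees richer co-occurrence statistics than it would from individual SMS texts. The number of clusters `k` is a free parameter here; the abstract does not state how it was chosen.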

Citation

Pages: 380-383

Page count: 4

50 entries in total

[42] Text normalization and semantic indexing to enhance Instant Messaging and SMS spam filtering
Almeida, Tiago A.
Silva, Tiago P.
Santos, Igor
Gomez Hidalgo, Jose M.
[J]. KNOWLEDGE-BASED SYSTEMS, 2016, 108 : 25 - 32
[43] Contributions to the study of bi-lingual Roman Urdu SMS Spam filtering
Mehmood, Kashif
Afzal, Hammad
Majeed, Awais
Latif, Hassan
[J]. 2015 NATIONAL SOFTWARE ENGINEERING CONFERENCE (NSEC), 2015, : 42 - 47
[44] Spam SMS filtering based on text features and supervised machine learning techniques
Abid, Muhammad Adeel
Ullah, Saleem
Siddique, Muhammad Abubakar
Mushtaq, Muhammad Faheem
Aljedaani, Wajdi
Rustam, Furqan
[J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (28) : 39853 - 39871
[45] A Vector Space Model based spam SMS filter
Li, Wei
Zeng, Sisheng
[J]. 2016 11TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION (ICCSE), 2016, : 553 - 557
[46] A Discrete Hidden Markov Model for SMS Spam Detection
Xia, Tian
Chen, Xuemin
[J]. APPLIED SCIENCES-BASEL, 2020, 10 (14):
[47] Efficient spam filtering through intelligent text modification detection using machine learning
Mageshkumar, N.
Vijayaraj, A.
Arunpriya, N.
Sangeetha, A.
[J]. MATERIALS TODAY-PROCEEDINGS, 2022, 64 : 848 - 858
[49] Intelligent Security Schema for SMS Spam Message Based on Machine Learning Algorithms
Alshahrani, Ali
[J]. International Journal of Interactive Mobile Technologies, 2021, 15 (16) : 52 - 62
[50] Computing a Comprehensible Model for Spam Filtering
Ruiz-Sepulveda, Amparo
Trivino-Rodriguez, Jose L.
Morales-Bueno, Rafael
[J]. DISCOVERY SCIENCE, PROCEEDINGS, 2009, 5808 : 457 - 464
