Intelligent SMS Spam Filtering Using Topic Model

Cited by: 14
Authors
Ma, Jialin [1, 2]
Zhang, Yongjun [1, 2]
Liu, Jinling [1 ]
Yu, Kun [1 ]
Wang, XuAn [3 ]
Affiliations
[1] Huaiyin Inst Technol, Huaian, Peoples R China
[2] Hohai Univ, Nanjing, Jiangsu, Peoples R China
[3] CAPF, Engn Univ, Xian, Peoples R China
Keywords
SMS Spam; Topic Model; LDA; MTM
DOI
10.1109/INCoS.2016.47
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Nowadays, spam messages are overflowing in many countries. They seriously violate personal rights and may even harm national security. Existing filtering techniques usually use traditional text classifiers, which are better suited to normal long texts; they therefore face serious challenges with SMS messages, such as sparse and noisy data. This work proposes a message topic model (MTM) for SMS spam filtering. The MTM derives from the probabilistic topic model, but it differs from standard Latent Dirichlet Allocation (LDA) in the following aspects: (1) To overcome the sparsity problem in SMS message classification, the standard K-means algorithm is first used to group the training data into rough classes, and all the spam messages of a class are then aggregated into a single document. (2) Symbol semantics is taken into account. Some preprocessing rules and background terms are considered so that the model can represent SMS spam more fully. Finally, we compare the MTM with the SVM and the standard LDA on the public SMS spam corpus. The experimental results show that the MTM is more effective for the task of SMS spam filtering.
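The aggregation step in point (1) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the feature representation, distance measure, or initialization, so the term-count vectors, squared Euclidean distance, first-k centroid initialization, and all function names below are assumptions made for illustration. The sketch clusters short messages with a plain K-means and joins each cluster into one longer pseudo-document, which a topic model such as LDA could then consume in place of the sparse individual messages.

```python
from collections import Counter


def tokenize(text):
    # Lowercased whitespace tokenization (an illustrative assumption).
    return text.lower().split()


def vectorize(messages):
    # Build term-count vectors over the vocabulary of all messages.
    vocab = sorted({tok for m in messages for tok in tokenize(m)})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for m in messages:
        vec = [0.0] * len(vocab)
        for tok, count in Counter(tokenize(m)).items():
            vec[index[tok]] = float(count)
        vectors.append(vec)
    return vectors


def sq_dist(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    # Plain K-means. The first k vectors seed the centroids so the
    # result is reproducible (an assumption, not taken from the paper).
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def aggregate_by_cluster(messages, labels):
    # Join all messages of a cluster into one pseudo-document.
    grouped = {}
    for message, label in zip(messages, labels):
        grouped.setdefault(label, []).append(message)
    return {label: " ".join(ms) for label, ms in grouped.items()}


def build_pseudo_documents(messages, k):
    # Cluster the messages, then aggregate each cluster.
    labels = kmeans(vectorize(messages), k)
    return aggregate_by_cluster(messages, labels)
```

Calling `build_pseudo_documents(spam_messages, k)` on a list of spam strings returns a dictionary that maps each cluster label to its aggregated text. The number of clusters `k` is a free parameter here; the abstract does not say how it was chosen.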
Pages: 380-383
Page count: 4
Related papers
50 in total
  • [11] Thai-English Spam SMS Filtering
    Khemapatapan, Chaiyaporn
    [J]. 2010 16TH ASIA-PACIFIC CONFERENCE ON COMMUNICATIONS (APCC 2010), 2010, : 226 - 230
  • [12] SMS Spam Filtering Based on "Cloud Security"
    Wu, Hongli
    Jiang, Yonghui
    [J]. INFORMATION TECHNOLOGY APPLICATIONS IN INDUSTRY, PTS 1-4, 2013, 263-266 : 2015 - 2019
  • [13] Hybrid SMS Spam Filtering System Using Machine Learning Techniques
    Baaqeel, Hind
    Zagrouba, Rachid
    [J]. 2020 21ST INTERNATIONAL ARAB CONFERENCE ON INFORMATION TECHNOLOGY (ACIT), 2020,
  • [14] Using Evolutionary Learning Classifiers To Do Mobile Spam (SMS) Filtering
    Junaid, M. Bilal
    Farooq, Muddassar
    [J]. GECCO-2011: PROCEEDINGS OF THE 13TH ANNUAL GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, 2011, : 1795 - 1801
  • [15] A Spam Transformer Model for SMS Spam Detection
    Liu, Xiaoxu
    Lu, Haoye
    Nayak, Amiya
    [J]. IEEE ACCESS, 2021, 9 : 80253 - 80263
  • [16] A Method of SMS Spam Filtering Based on AdaBoost Algorithm
    Zhang, Xipeng
    Xiong, Gang
    Hu, Yuexiang
    Zhu, Fenghua
    Dong, Xisong
    Nyberg, Timo R.
    [J]. PROCEEDINGS OF THE 2016 12TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA), 2016, : 2328 - 2332
  • [17] Towards Filtering of SMS Spam Messages Using Machine Learning Based Technique
    Choudhary, Neelam
    Jain, Ankit Kumar
    [J]. ADVANCED INFORMATICS FOR COMPUTING RESEARCH, ICAICR 2017, 2017, 712 : 18 - 30
  • [18] Spam Goes Mobile: Filtering Unsolicited SMS Traffic
    Androulidakis, Iosif
    Vlachos, Vasileios
    Papanikolaou, Alexandros
    [J]. 2012 20TH TELECOMMUNICATIONS FORUM (TELFOR), 2012, : 1452 - 1455
  • [19] The Impact of Feature Extraction and Selection on SMS Spam Filtering
    Uysal, A. K.
    Gunal, S.
    Ergin, S.
    Gunal, E. Sora
    [J]. ELEKTRONIKA IR ELEKTROTECHNIKA, 2013, 19 (05) : 67 - 72
  • [20] Simple SMS spam filtering on independent mobile phone
    Nuruzzaman, M. Taufiq
    Lee, Changmoo
    bin Abdullah, Mohd. Fikri Azli
    Choi, Deokjai
    [J]. SECURITY AND COMMUNICATION NETWORKS, 2012, 5 (10) : 1209 - 1220