Intelligent SMS Spam Filtering Using Topic Model

Cited by: 14
Authors
Ma, Jialin [1, 2]
Zhang, Yongjun [1, 2]
Liu, Jinling [1]
Yu, Kun [1]
Wang, XuAn [3]
Affiliations
[1] Huaiyin Inst Technol, Huaian, Peoples R China
[2] Hohai Univ, Nanjing, Jiangsu, Peoples R China
[3] CAPF, Engn Univ, Xian, Peoples R China
Keywords
SMS Spam; Topic Model; LDA; MTM
DOI
10.1109/INCoS.2016.47
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Spam messages are now widespread in many countries. They seriously violate personal rights and may even harm national security. Existing filtering techniques usually use traditional text classifiers, which are better suited to normal long texts; they therefore face serious challenges with SMS messages, such as sparse and noisy data. This work proposes a message topic model (MTM) for SMS spam filtering. The MTM derives from the probabilistic topic model, but it differs from standard Latent Dirichlet Allocation (LDA) in two respects: (1) To overcome the sparsity problem in SMS message classification, the standard K-means algorithm is first used to group the training data into rough classes, and all the spam messages of a class are then aggregated into a single document. (2) Symbol semantics is taken into account: preprocessing rules and background terms are introduced so that the model represents SMS spam more fully. Finally, we compare the MTM with the SVM and the standard LDA on the public SMS spam corpus. The experimental results show that the MTM is more effective for SMS spam filtering.
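The aggregation step in point (1) of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: whitespace tokenization, raw bag-of-words counts, initializing centroids from the first k messages, and every function name below are assumptions made for illustration only. The sketch groups short messages with K-means and joins each group into one longer pseudo-document, which is the kind of input a topic model such as LDA would then be fitted on.

```python
from collections import Counter


def vectorize(messages):
    """Turn each message into a bag-of-words count vector over a shared vocabulary."""
    tokenized = [m.lower().split() for m in messages]  # assumed tokenization
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for toks in tokenized:
        vec = [0.0] * len(vocab)
        for tok, count in Counter(toks).items():
            vec[index[tok]] = float(count)
        vectors.append(vec)
    return vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's K-means; centroids start at the first k vectors (an assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def aggregate_by_cluster(messages, labels, k):
    """Concatenate the messages of each rough class into one pseudo-document."""
    return [
        " ".join(m for m, lab in zip(messages, labels) if lab == c)
        for c in range(k)
    ]


# Toy spam messages, invented for this sketch.
spam = [
    "win cash prize now",
    "cheap loan offer today",
    "claim cash prize now",
    "cheap loan approved today",
]
labels = kmeans(vectorize(spam), k=2)
documents = aggregate_by_cluster(spam, labels, k=2)
```

On this toy input the two prize messages end up in one pseudo-document and the two loan messages in the other, so each document carries more word co-occurrence evidence than any single message does.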
Pages: 380-383
Page count: 4