SMS Spam Filtering Using Probabilistic Topic Modelling and Stacked Denoising Autoencoder

Cited by: 13
Authors
Al Moubayed, Noura [1]
Breckon, Toby [1]
Matthews, Peter [1]
McGough, A. Stephen [1]
Affiliations
[1] Univ Durham, Sch Engn & Comp Sci, Durham DH1 3LE, England
DOI
10.1007/978-3-319-44781-0_50
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we present a novel approach to spam filtering and demonstrate its applicability with respect to SMS messages. Our approach requires minimal feature engineering and a small set of labelled data samples. Features are extracted using topic modelling based on latent Dirichlet allocation, and then a comprehensive data model is created using a Stacked Denoising Autoencoder (SDA). Topic modelling summarises the data, providing ease of use and high interpretability by visualising the topics using word clouds. Given that SMS messages can be regarded as either spam (unwanted) or ham (wanted), the SDA is able to model the messages and accurately discriminate between the two classes without the need for a pre-labelled training set. The results are compared against state-of-the-art spam detection algorithms, with our proposed approach achieving over 97% accuracy, which compares favourably to the best reported algorithms in the literature.
Pages: 423-430
Number of pages: 8