Intelligent SMS Spam Filtering Using Topic Model

Cited by: 14
Authors
Ma, Jialin [1, 2]
Zhang, Yongjun [1, 2]
Liu, Jinling [1 ]
Yu, Kun [1 ]
Wang, XuAn [3 ]
Affiliations
[1] Huaiyin Inst Technol, Huaian, Peoples R China
[2] Hohai Univ, Nanjing, Jiangsu, Peoples R China
[3] CAPF, Engn Univ, Xian, Peoples R China
Keywords
SMS Spam; Topic Model; LDA; MTM;
DOI
10.1109/INCoS.2016.47
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Spam messages are now pervasive in many countries. They seriously violate personal rights and may even harm national security. Existing filtering techniques usually use traditional text classifiers, which are better suited to normal long texts; they therefore face serious challenges with SMS messages, such as data sparsity and noisy data. This work proposes a message topic model (MTM) for SMS spam filtering. The MTM derives from the probabilistic topic model, but it differs from standard Latent Dirichlet Allocation (LDA) in the following respects: (1) To overcome the sparsity problem in SMS message classification, the standard K-means algorithm is first used to group the training data into rough classes, and all the spam messages of a class are then aggregated into a single document. (2) Symbol semantics are taken into account: some preprocessing rules and background terms are considered so that the model represents SMS spam more fully. Finally, we compare the MTM with the SVM and standard LDA on the public SMS spam corpus. The experimental results show that the MTM is more effective for the task of SMS spam filtering.
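The aggregation step in point (1) of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the raw term-count vectors, the choice of the first k messages as initial centroids, and the four toy messages are all assumptions made for illustration. It only shows how short messages might be grouped by K-means and then concatenated into one longer pseudo-document per cluster, which a topic model could consume afterwards.

```python
from collections import Counter


def tokenize(message):
    """Lowercase whitespace tokenizer (a stand-in for real preprocessing)."""
    return message.lower().split()


def vectorize(messages):
    """Turn each message into a term-count vector over a shared vocabulary."""
    vocab = sorted({tok for msg in messages for tok in tokenize(msg)})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for msg in messages:
        vec = [0.0] * len(vocab)
        for tok, count in Counter(tokenize(msg)).items():
            vec[index[tok]] = float(count)
        vectors.append(vec)
    return vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iter=100):
    """Plain Lloyd's K-means; the first k vectors seed the centroids (assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [-1] * len(vectors)
    for _ in range(max_iter):
        # Assignment step: attach every vector to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def aggregate_by_cluster(messages, labels, k):
    """Concatenate all messages of each cluster into one pseudo-document."""
    return [
        " ".join(m for m, lab in zip(messages, labels) if lab == c)
        for c in range(k)
    ]


# Hypothetical toy spam messages, invented for this sketch.
messages = [
    "win a free prize now",
    "cheap loan approved call now",
    "free prize claim your free prize",
    "loan offer cheap loan rates",
]
labels = kmeans(vectorize(messages), k=2)
docs = aggregate_by_cluster(messages, labels, k=2)
for doc in docs:
    print(doc)
```

Each pseudo-document is longer than any single message, which is the property the abstract relies on to reduce sparsity. In practice a library K-means with a better initialization would likely replace the hand-written loop.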
Pages: 380-383
Number of pages: 4