A Message Topic Model for Multi-Grain SMS Spam Filtering

Cited by: 7
Authors
Ma, Jialin [1, 2]
Zhang, Yongjun [1, 2]
Wang, Zhijian [2]
Yu, Kun [1]
Affiliations
[1] Huaiyin Inst Technol, Huaian, Peoples R China
[2] Hohai Univ, Coll Comp & Informat, Nanjing, Jiangsu, Peoples R China
Keywords
LDA; MTM; SMS Spam; SVM; Topic Model
DOI
10.4018/IJTHI.2016040107
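Chinese Library Classification (CLC) number
G25 [Library Science, Librarianship]; G35 [Information Science, Information Work]
Discipline classification code
1205; 120501
Abstract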
At present, content-based methods are regarded as the more effective approach to Short Message Service (SMS) spam filtering. However, they usually rely on traditional text classification techniques, which are better suited to normal long texts; they therefore face serious challenges with SMS messages, such as sparse data and noisy data. In addition, existing SMS spam filtering methods usually treat the task as a binary classification problem, which cannot provide the different categories needed for multi-grain SMS spam filtering. In this paper, the authors propose a message topic model (MTM) for multi-grain SMS spam filtering. The MTM derives from the well-known probabilistic topic model and is improved in this paper to make it more suitable for SMS spam filtering. Finally, the authors compare the MTM with the SVM and the standard LDA on the public SMS spam corpus. The experimental results show that the MTM is more effective for the task of SMS spam filtering.
Pages: 83-95
Number of pages: 13