Transfer Naive Bayes Learning using Augmentation and Stacking for SMS Spam Detection

Cited by: 2
Authors
Ulus, Cihan [1]
Wang, Zhiqiang [1]
Iqbal, Sheikh M. A. [1]
Khan, K. Md. Salman [1]
Zhu, Xingquan [1]
Affiliations
[1] Florida Atlantic Univ, Dept Elect Engn & Comp Sci, Boca Raton, FL 33431 USA
Funding
National Science Foundation (USA)
Keywords
Transfer learning; Naive Bayes classification; short message service; SMS; spam detection
DOI
10.1109/ICKG55886.2022.00042
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Short Message Service (SMS) spam, i.e., unsolicited messages delivered to phones, is prevalent yet difficult to filter out. The Naive Bayes (NB) classifier is a frequently used approach to text spam filtering because of its simple but rigorous statistical learning nature and the transparency of its decisions. For SMS messages, however, simple NB classification is ineffective, because SMS texts are short and often contain numerous typos, abbreviations, and slang words. In this paper, we propose AstNB, a new Augmentation and Stacking combined Transfer learning approach for NB classification. For effective transfer learning from a source domain (e.g., emails) to a target domain (e.g., SMS), AstNB first uses data augmentation to generate multiple copies of the training data, each formed by combining a target domain sample with a randomly selected source domain instance, and then trains a number of base classifiers on the augmented data. A stacking process then builds a new feature space by aggregating the predictions of the base classifiers with the feature space created from the target data. A final classifier is trained on this feature space to classify unlabeled SMS messages as spam or not. Experiments and comparisons show that AstNB effectively transfers knowledge from the source domain for SMS spam detection, especially when the target domain has very few labeled messages.
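The augmentation-then-stacking pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `TinyNB`, `augment`, `stack`, `train_astnb_sketch`, and `classify` are hypothetical, and so is the toy data. The choice to pair each target message with a source instance of the same label, and to encode base predictions as pseudo-tokens appended to the message, are assumptions made for this sketch that the abstract does not specify.

```python
import math
import random
from collections import Counter


class TinyNB:
    """Multinomial Naive Bayes over whitespace tokens with Laplace smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for doc, y in zip(docs, labels):
            self.counts[y].update(doc.lower().split())
        self.vocab = set().union(*self.counts.values())
        self.total = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict(self, doc):
        n, v = sum(self.prior.values()), len(self.vocab)

        def score(c):
            s = math.log(self.prior[c] / n)
            for tok in doc.lower().split():
                s += math.log((self.counts[c][tok] + 1) / (self.total[c] + v))
            return s

        return max(self.classes, key=score)


def augment(tgt_docs, tgt_labels, src_docs, src_labels, rng):
    """One augmented copy: each target message plus a random source instance.

    Assumption: the source instance is drawn from the same class as the target
    message, so every target label must also occur in the source data.
    """
    by_label = {}
    for doc, y in zip(src_docs, src_labels):
        by_label.setdefault(y, []).append(doc)
    return [t + " " + rng.choice(by_label[y]) for t, y in zip(tgt_docs, tgt_labels)]


def stack(doc, bases):
    """Stacked representation: original tokens plus one pseudo-token per base prediction."""
    meta = [f"__base{i}={b.predict(doc)}" for i, b in enumerate(bases)]
    return doc + " " + " ".join(meta)


def train_astnb_sketch(tgt_docs, tgt_labels, src_docs, src_labels, n_bases=5, seed=0):
    rng = random.Random(seed)
    # Train one NB classifier per augmented copy of the target training data.
    bases = [
        TinyNB().fit(augment(tgt_docs, tgt_labels, src_docs, src_labels, rng), tgt_labels)
        for _ in range(n_bases)
    ]
    # Train the final classifier on target features plus base predictions.
    final = TinyNB().fit([stack(d, bases) for d in tgt_docs], tgt_labels)
    return bases, final


def classify(doc, bases, final):
    return final.predict(stack(doc, bases))


# Toy data (invented for illustration): emails as source, SMS as target.
src_docs = [
    "win a free prize now click here",
    "meeting agenda attached for tomorrow",
    "claim your free cash reward",
    "please review the quarterly report",
]
src_labels = ["spam", "ham", "spam", "ham"]
tgt_docs = ["free prize txt now", "c u at lunch"]
tgt_labels = ["spam", "ham"]

bases, final = train_astnb_sketch(tgt_docs, tgt_labels, src_docs, src_labels)
print(classify("free cash txt now", bases, final))
```

For brevity the sketch feeds in-sample base predictions to the final classifier; a more careful stacking setup would typically use held-out (cross-validated) predictions to limit overfitting.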
Pages: 275-282
Page count: 8
Related papers
  • [1] Twitter Spam Detection Using Naive Bayes Classifier
    Santoshi, K. Ushasree
    Bhavya, S. Sree
    Sri, Y. Bhavya
    Venkateswarlu, B.
    [J]. PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON INVENTIVE COMPUTATION TECHNOLOGIES (ICICT 2021), 2021, : 773 - 777
  • [2] Less naive Bayes spam detection
    Yang, Hongming
    Stassen, Maurice
    Tjalkens, Tjalling
    [J]. 2007 INFORMATION THEORY AND APPLICATIONS WORKSHOP, 2007, : 386 - +
  • [3] Detecting Spam Emails/SMS Using Naive Bayes, Support Vector Machine and Random Forest
    Goswami, Vasudha
    Malviya, Vijay
    Sharma, Pratyush
    [J]. INNOVATIVE DATA COMMUNICATION TECHNOLOGIES AND APPLICATION, 2020, 46 : 608 - 615
  • [4] Spam Mail Detection using Naive Bayes method with Apache Spark
    Aydogan, Murat
    Karci, Ali
    [J]. 2018 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND DATA PROCESSING (IDAP), 2018,
  • [5] Improving the performance of Naive Bayes classifier for spam detection
    Yang, Zhen
    Guo, Jun
    Xu, Weiran
    Chen, Bo
    Hu, Jiani
    [J]. DYNAMICS OF CONTINUOUS DISCRETE AND IMPULSIVE SYSTEMS-SERIES B-APPLICATIONS & ALGORITHMS, 2006, 13E : 694 - 698
  • [6] Enhancing Spam Detection on Mobile Phone Short Message Service (SMS) Performance using FP-Growth and Naive Bayes Classifier
    Arifin, Dea Delvia
    Shaufiah
    Bijaksana, Moch Arif
    [J]. 2016 IEEE ASIA PACIFIC CONFERENCE ON WIRELESS AND MOBILE (APWIMOB), 2016, : 80 - 84
  • [7] A Comparative Study of Spam SMS Detection using Machine Learning Classifiers
    Gupta, Mehul
    Bakliwal, Aditya
    Agarwal, Shubhangi
    Mehndiratta, Pulkit
    [J]. 2018 ELEVENTH INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING (IC3), 2018, : 287 - 293
  • [8] Email Spam Detection using integrated approach of Naive Bayes and Particle Swarm Optimization
    Agarwal, Kriti
    Kumar, Tarun
    [J]. PROCEEDINGS OF THE 2018 SECOND INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS), 2018, : 685 - 690
  • [9] Machine Learning-based spam detection using Naive Bayes Classifier in comparison with Logistic Regression for improving accuracy
    Kumar, K. Varun
    Ramamoorthy, M.
    [J]. JOURNAL OF PHARMACEUTICAL NEGATIVE RESULTS, 2022, 13 : 548 - 554
  • [10] SMS Spam Detection Using Noncontent Features
    Xu, Qian
    Xiang, Evan Wei
    Yang, Qiang
    Du, Jiachun
    Zhong, Jieping
    [J]. IEEE INTELLIGENT SYSTEMS, 2012, 27 (06) : 44 - 51