Detection of SMS Spam Messages Using TF-IDF Vectorizer and Deep Learning Models

Authors
Bravo, John Adam, V [1]
De Goma, Joel C. [1]
Prudente, Springtime [1]
Rondilla, Robert Francis A. [1]
Affiliations
[1] Mapua Univ, Manila, Philippines
Keywords
Spam message; ham messages; spam detection; TF-IDF vectorizer; LSTM
DOI
10.1145/3654522.3654580
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The spread of SMS spam has grown into a major problem and a chronic annoyance for mobile phone users worldwide. These unwanted messages, sent with malicious intent or to promote a company, have evolved into increasingly complex schemes. In response to this expanding threat, this article sets out to improve existing SMS spam detection models using deep learning and word embedding techniques. The researchers emphasize the importance of combating SMS spam, review the history of spam filtering, and detail the study methodology, which includes data preprocessing steps such as punctuation removal, lowercase conversion, word stemming, and TF-IDF vectorization. The experimental framework employs LSTM and BiLSTM models, each with and without TF-IDF vectors, for a total of four distinct models. Each model is evaluated with K-Fold cross-validation, and the results are compared to assess how effectively TF-IDF vectorization improves SMS spam detection. This study aims to provide mobile phone users with improved defenses against SMS spam.
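The preprocessing steps named in the abstract (punctuation removal, lowercase conversion, word stemming, TF-IDF vectorization) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names `preprocess`, `naive_stem`, and `tfidf_vectors` are hypothetical, the suffix-stripping stemmer is a crude stand-in for whatever stemmer the study used, and the smoothed IDF formula is one common variant chosen here as an assumption. The LSTM and BiLSTM models and the K-Fold evaluation are not shown.

```python
import math
import string
from collections import Counter


def naive_stem(token):
    # Hypothetical stand-in for a real stemmer: strips a few common suffixes
    # while keeping at least three characters of the word.
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(message):
    # Lowercase, remove ASCII punctuation, split on whitespace, then stem.
    text = message.lower().translate(str.maketrans("", "", string.punctuation))
    return [naive_stem(token) for token in text.split()]


def tfidf_vectors(messages):
    # Returns the sorted vocabulary and one TF-IDF vector per message.
    docs = [preprocess(m) for m in messages]
    vocab = sorted({token for doc in docs for token in doc})
    n_docs = len(docs)
    # Document frequency: number of messages containing each term.
    doc_freq = Counter(token for doc in docs for token in set(doc))
    # Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty messages
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


if __name__ == "__main__":
    sample = ["WIN a FREE prize now!!!", "Are we meeting for lunch?"]
    vocab, vectors = tfidf_vectors(sample)
    print(vocab)
    print(vectors[0])
```

Each message becomes a fixed-length vector over the shared vocabulary, which is the kind of representation a downstream classifier could consume in the "with TF-IDF" configurations the abstract describes.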
Pages: 245-249
Page count: 5