Detection of SMS Spam Messages Using TF-IDF Vectorizer and Deep Learning Models

Cited by: 0

Authors:

Bravo, John Adam, V [1]

De Goma, Joel C. [1]

Prudente, Springtime [1]

Rondilla, Robert Francis A. [1]

Affiliations:

[1] Mapua Univ, Manila, Philippines

Source:

PROCEEDINGS OF THE 2024 9TH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION TECHNOLOGY, ICIIT 2024 | 2024

Keywords:

Spam message; ham messages; spam detection; TF-IDF vectorizer; LSTM

DOI:

10.1145/3654522.3654580

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

The spread of SMS spam has become a major issue and a chronic annoyance to mobile phone users worldwide. These unwanted messages, sent with malicious intent or to promote a company, have evolved into increasingly complex schemes. In response to this expanding threat, this article sets out to improve existing SMS spam detection models using deep learning and word embedding techniques. The authors emphasize the importance of combating SMS spam, review the history of spam filtering, and detail the study methodology, which includes data preprocessing (punctuation removal, lowercase conversion, and word stemming) followed by TF-IDF vectorization. The experimental framework employs both LSTM and BiLSTM models, each with and without TF-IDF vectors, for a total of four distinct models. Each model is evaluated with K-Fold cross-validation, and the results are compared to assess how much TF-IDF vectorization improves SMS spam detection. This study aims to provide mobile phone users with improved defenses against the threat of SMS spam.
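
The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names (`preprocess`, `naive_stem`, `tfidf`, `k_fold_indices`) are hypothetical, the suffix stripper is a crude stand-in for a real stemmer such as Porter's, and the weighting uses the textbook form tf × log(N / df), which may differ from the variant used in the paper. The LSTM and BiLSTM models themselves are omitted.

```python
import math
import string
from collections import Counter


def naive_stem(token):
    # Hypothetical stand-in for a real stemmer: strip one common suffix,
    # but only if at least three characters remain.
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(message):
    # Lowercase conversion, punctuation removal, whitespace tokenization, stemming.
    text = message.lower().translate(str.maketrans("", "", string.punctuation))
    return [naive_stem(tok) for tok in text.split()]


def tfidf(docs):
    # Return one {term: weight} dict per document, using
    # tf = count / document length and idf = log(N / document frequency).
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append(
            {t: (c / len(toks)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def k_fold_indices(n_samples, k):
    # Yield (train, test) index lists for k contiguous, near-equal folds.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


if __name__ == "__main__":
    # Made-up example messages, not taken from the paper's dataset.
    messages = [
        "WIN a FREE prize now!!!",
        "Are we meeting for lunch?",
        "Free entry, win cash now",
    ]
    for msg, vec in zip(messages, tfidf(messages)):
        print(msg, "->", vec)
    for train, test in k_fold_indices(len(messages), 3):
        print("train:", train, "test:", test)
```

In this sketch a term that appears in every message receives weight zero, while a term confined to a single message receives the highest idf, which is the property that makes TF-IDF a plausible input feature for a spam classifier.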

Pages: 245-249

Page count: 5
