Detecting ham and spam emails using feature union and supervised machine learning models

Cited by: 0
Authors
Furqan Rustam
Najia Saher
Arif Mehmood
Ernesto Lee
Sandrilla Washington
Imran Ashraf
Affiliations
[1] University College Dublin, School of Computer Science
[2] The Islamia University of Bahawalpur, Department of CS and IT
[3] Miami Dade College, College of Engineering and Technology
[4] Spelman College, Department of Computer and Information Sciences
[5] Yeungnam University, Department of Information and Communication Engineering
Source
Keywords
Spam detection; Features extraction; Machine learning classifiers; Term frequency; Sampling
DOI
Not available
CLC number
Subject classification number
Abstract
Spam emails are cyber nuisances that pose serious security threats, including threats to personal and financial information. Although several spam detection approaches exist, detecting new strains of spam messages remains challenging and requires a reliable, efficient, and intelligent spam email detection approach. This study uses features extracted from the text of an email to determine whether the email is spam or normal (ham). Multiple features are combined to obtain higher accuracy for spam email detection. Experiments involve machine learning and deep learning models, and the influence of data resampling is also investigated. Performance is analyzed using F1 score, recall, precision, and accuracy, and is compared with state-of-the-art approaches. Random forest and logistic regression achieve the highest accuracy scores, 0.991 and 0.990 respectively, which is better than the existing models used for comparison.
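The four evaluation measures named in the abstract can be illustrated with a short sketch. This is not the authors' code: the function name `classification_metrics`, the choice of "spam" as the positive class, and the toy labels are all assumptions made here for illustration, and the numbers have nothing to do with the results reported in the paper.

```python
def classification_metrics(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, recall and F1 for one positive class.

    Hypothetical helper for illustration; treating "spam" as the positive
    class is an assumption, not something stated in the abstract.
    """
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against a zero denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels, used only to exercise the function.
y_true = ["spam", "ham", "spam", "ham", "spam", "ham", "ham", "spam"]
y_pred = ["spam", "ham", "ham", "ham", "spam", "spam", "ham", "spam"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

On these toy labels there are 3 true positives, 1 false positive, 1 false negative and 3 true negatives, so all four measures come out to 0.75. On imbalanced data the measures generally diverge, which is one reason to report precision, recall and F1 alongside accuracy.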
Pages: 26545 - 26561
Number of pages: 16
Related papers
  • [1] Detecting ham and spam emails using feature union and supervised machine learning models
    Rustam, Furqan
    Saher, Najia
    Mehmood, Arif
    Lee, Ernesto
    Washington, Sandrilla
    Ashraf, Imran
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (17) : 26545 - 26561
  • [2] Machine Learning-Based Detection of Spam Emails
    Bin Siddique, Zeeshan
    Khan, Mudassar Ali
    Din, Ikram Ud
    Almogren, Ahmad
    Mohiuddin, Irfan
    Nazir, Shah
    [J]. SCIENTIFIC PROGRAMMING, 2021, 2021
  • [3] Detecting Spam Emails/SMS Using Naive Bayes, Support Vector Machine and Random Forest
    Goswami, Vasudha
    Malviya, Vijay
    Sharma, Pratyush
    [J]. INNOVATIVE DATA COMMUNICATION TECHNOLOGIES AND APPLICATION, 2020, 46 : 608 - 615
  • [4] SMS Spam Filtering using Supervised Machine Learning Algorithms
    Navaney, Pavas
    Dubey, Gaurav
    Rana, Ajay
    [J]. PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE CONFLUENCE 2018 ON CLOUD COMPUTING, DATA SCIENCE AND ENGINEERING, 2018 : 43 - 48
  • [5] A feature-centric spam email detection model using diverse supervised machine learning algorithms
    Zamir, Ammara
    Khan, Hikmat Ullah
    Mehmood, Waqar
    Iqbal, Tassawar
    Akram, Abubakker Usman
    [J]. ELECTRONIC LIBRARY, 2020, 38 (03) : 633 - 657
  • [6] Detecting Spam Tweets Using Machine Learning and Effective Preprocessing
    Kardas, Berk
    Bayar, Ismail Erdem
    Ozyer, Tansel
    Alhajj, Reda
    [J]. PROCEEDINGS OF THE 2021 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING, ASONAM 2021, 2021 : 393 - 398
  • [7] An Unsupervised Approach for Content-Based Clustering of Emails Into Spam and Ham Through Multiangular Feature Formulation
    Karim, Asif
    Azam, Sami
    Shanmugam, Bharanidharan
    Kannoorpatti, Krishnan
    [J]. IEEE ACCESS, 2021, 9 : 135186 - 135209
  • [8] Ham and Spam E-Mails Classification Using Machine Learning Techniques
    Bassiouni, M.
    Ali, M.
    El-Dahshan, E. A.
    [J]. JOURNAL OF APPLIED SECURITY RESEARCH, 2018, 13 (03) : 315 - 331
  • [9] Interaction between Feature Subset Selection Techniques and Machine Learning Classifiers for Detecting Unsolicited Emails
    Trivedi, Shrawan Kumar
    Dey, Shubhamoy
    [J]. APPLIED COMPUTING REVIEW, 2014, 14 (01) : 53 - 61
  • [10] An improved transformer-based model for detecting phishing, spam and ham emails: A large language model approach
    Jamal, Suhaima
    Wimmer, Hayden
    Sarker, Iqbal H.
    [J]. SECURITY AND PRIVACY, 2024