SMS Spam Detection using H2O Framework

Cited by: 16
Authors
Suleiman, Dima [1, 2]
Al-Naymat, Ghazi [1]
Affiliations
[1] Princess Sumaya Univ Technol, Comp Sci Dept, King Hussein Fac Comp Sci, Amman, Jordan
[2] Univ Jordan, Business Informat Technology Dept, Amman, Jordan
Source
8TH INTERNATIONAL CONFERENCE ON EMERGING UBIQUITOUS SYSTEMS AND PERVASIVE NETWORKS (EUSPN 2017) / 7TH INTERNATIONAL CONFERENCE ON CURRENT AND FUTURE TRENDS OF INFORMATION AND COMMUNICATION TECHNOLOGIES IN HEALTHCARE (ICTH-2017) / AFFILIATED WORKSHOPS | 2017 / Vol. 113
Keywords
SMS spam; Random Forest; Naive Bayes; Deep Learning; H2O;
DOI
10.1016/j.procs.2017.08.335
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
SMS spam is a common concern, and many people find such messages annoying. Many SMS spam detection methods already exist, using classifiers based on Support Vector Machines, Naive Bayes and many other machine learning algorithms. In this paper, a new classifier is proposed that relies mainly on H2O as a platform for comparing different machine learning algorithms. The algorithms compared are random forest, deep learning and Naive Bayes. In addition to serving as classifiers, deep learning and random forest are also used to determine the most important features to use as input to the random forest, deep learning and Naive Bayes classifiers. Experimental results show that the most significant features affecting SMS spam detection are the number of digits and the presence of a URL in the SMS text. The dataset used in the experiments is the one provided by the UCI Machine Learning Repository. The experiments show that Naive Bayes is the fastest algorithm, with a runtime of 0.6 seconds; however, compared with deep learning and random forest, it has the lowest precision, recall, F-measure and accuracy. Random forest, on the other hand, is the best in terms of accuracy with 50 trees and a maximum depth of 20, where precision, recall, F-measure and accuracy are 96%, 86%, 91% and 97.7%, respectively; its runtime, however, is high at 30.28 seconds. (C) 2017 The Authors. Published by Elsevier B.V.
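The two features the abstract singles out (digit count and URL presence) and the reported F-measure can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how URLs were detected, so the regular expression `URL_PATTERN`, the function names, and the sample messages are all assumptions made for illustration.

```python
import re

# Assumption: the abstract does not specify URL detection, so this
# pattern (http://, https:// or www. prefixes) is purely illustrative.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def extract_features(message: str) -> dict:
    """Return the two features the abstract highlights for one SMS text."""
    return {
        # Count of digit characters anywhere in the message.
        "num_digits": sum(ch.isdigit() for ch in message),
        # True if the message contains something that looks like a URL.
        "has_url": bool(URL_PATTERN.search(message)),
    }


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical messages, not taken from the UCI dataset.
    spam = "WINNER! Claim 1000 now at www.example.com or call 0800123456"
    ham = "Are we still meeting for lunch today?"
    print(extract_features(spam))
    print(extract_features(ham))
    # Precision and recall as reported for random forest in the abstract.
    print(round(f_measure(0.96, 0.86), 2))
```

With the reported precision of 96% and recall of 86%, the harmonic mean rounds to 0.91, which agrees with the 91% F-measure stated in the abstract.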
Pages: 154-161
Number of pages: 8
Related papers
50 in total
  • [1] SMS Spam Detection Using Noncontent Features
    Xu, Qian
    Xiang, Evan Wei
    Yang, Qiang
    Du, Jiachun
    Zhong, Jieping
    IEEE INTELLIGENT SYSTEMS, 2012, 27 (06) : 44 - 51
  • [2] A Spam Transformer Model for SMS Spam Detection
    Liu, Xiaoxu
    Lu, Haoye
    Nayak, Amiya
    IEEE ACCESS, 2021, 9 : 80253 - 80263
  • [3] SMS Spam Detection for Indian Messages
    Agarwal, Sakshi
    Kaur, Sanmeet
    Garhwal, Sunita
    2015 1ST INTERNATIONAL CONFERENCE ON NEXT GENERATION COMPUTING TECHNOLOGIES (NGCT), 2015, : 634 - 638
  • [4] A CNN Model for SMS Spam Detection
    Huang, Taihua
    2019 4TH INTERNATIONAL CONFERENCE ON MECHANICAL, CONTROL AND COMPUTER ENGINEERING (ICMCCE 2019), 2019, : 851 - 861
  • [5] SMS Spam Detection using Selected Text Features and Boosting Classifiers
    Akbari, Fatemeh
    Sajedi, Hedieh
    2015 7TH CONFERENCE ON INFORMATION AND KNOWLEDGE TECHNOLOGY (IKT), 2015,
  • [6] An Intelligent Framework Based on Deep Learning for SMS and e-mail Spam Detection
    Maqsood, Umair
    Ur Rehman, Saif
    Ali, Tariq
    Mahmood, Khalid
    Alsaedi, Tahani
    Kundi, Mahwish
    APPLIED COMPUTATIONAL INTELLIGENCE AND SOFT COMPUTING, 2023, 2023
  • [7] A Comparative Study of Spam SMS Detection using Machine Learning Classifiers
    Gupta, Mehul
    Bakliwal, Aditya
    Agarwal, Shubhangi
    Mehndiratta, Pulkit
    2018 ELEVENTH INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING (IC3), 2018, : 287 - 293
  • [8] Multi-Type Feature Extraction and Early Fusion Framework for SMS Spam Detection
    Al-Kabbi, Hussein Alaa
    Feizi-Derakhshi, Mohammad-Reza
    Pashazadeh, Saeid
    IEEE ACCESS, 2023, 11 : 123756 - 123765
  • [9] Graph Centrality Based Spam SMS Detection
    Ishtiaq, Asra
    Islam, Muhammad Arshad
    Iqbal, Muhammad Azhar
    Aleem, Muhammad
    Ahmed, Usman
    PROCEEDINGS OF 2019 16TH INTERNATIONAL BHURBAN CONFERENCE ON APPLIED SCIENCES AND TECHNOLOGY (IBCAST), 2019, : 629 - 633
  • [10] Transfer Naive Bayes Learning using Augmentation and Stacking for SMS Spam Detection
    Ulus, Cihan
    Wang, Zhiqiang
    Iqbal, Sheikh M. A.
    Khan, K. Md. Salman
    Zhu, Xingquan
    2022 IEEE INTERNATIONAL CONFERENCE ON KNOWLEDGE GRAPH (ICKG), 2022, : 275 - 282