Spam Detection Using Clustering-Based SVM

Cited by: 0
Author
Pandya, Darshit [1]
Affiliation
[1] Indus Univ, Dept Comp Engn, Ahmadabad 382115, Gujarat, India
Keywords
Text Classification; SVM; Clustering
DOI
10.1145/3366750.3366754
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Spam detection matters more than ever because of the growing use of messaging and mailing services. Classifying such a wide variety of messages efficiently is a comparatively onerous task. Many machine learning algorithms are used for spam detection, one of which is the Support Vector Machine (SVM). Although SVM is widely used to classify text-based documents, its performance in spam classification is not the best because of the uneven density of the training data. To improve the efficiency of SVM, I introduce a clustering-based SVM method: the training data is pre-processed using clustering algorithms, and the SVM classifier is then applied to the processed dataset. This method is intended to improve performance by overcoming the uneven distribution of the training data. The experimental results show improved performance compared with SVM alone.
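The abstract does not say which clustering algorithm is used or how the clusters feed the classifier, so the following is only a minimal sketch of one plausible reading, not the author's implementation. It assumes k-means run separately on each class, with the cluster centroids replacing the raw messages so that the dense class no longer dominates training. The toy corpus, the bag-of-words features, the function names, and every parameter value are made up for illustration. The SVM is a plain linear one without a bias term, trained by subgradient descent on the regularised hinge loss.

```python
import random

def vectorize(text, vocab):
    """Bag-of-words count vector over a fixed vocabulary."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocab]

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; returns k centroids (an emptied cluster keeps its old one)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        centroids = [
            [sum(col) / len(members) for col in zip(*members)] if members else old
            for members, old in zip(clusters, centroids)
        ]
    return centroids

def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=100, seed=0):
    """Subgradient descent on the L2-regularised hinge loss (no bias term)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regulariser)
            if margin < 1.0:  # point violates the margin: hinge step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def classify(text, w, vocab):
    """Label a message by the sign of the linear decision function."""
    score = sum(wj * xj for wj, xj in zip(w, vectorize(text, vocab)))
    return "spam" if score > 0 else "ham"

# Hypothetical toy corpus with uneven class sizes (4 spam vs 8 ham messages).
spam = ["win free prize", "free prize now", "claim free prize", "win cash now"]
ham = ["meeting at noon", "lunch meeting today", "report due today",
       "see you at lunch", "project report attached", "team meeting moved",
       "lunch at noon", "send the report"]
vocab = sorted({word for msg in spam + ham for word in msg.split()})

# Step 1 (assumed): cluster each class on its own and keep only the centroids,
# so both classes contribute the same number of training points.
K = 2
spam_centroids = kmeans([vectorize(m, vocab) for m in spam], K)
ham_centroids = kmeans([vectorize(m, vocab) for m in ham], K)

# Step 2: train the SVM on the condensed, balanced training set.
X = spam_centroids + ham_centroids
y = [1] * K + [-1] * K
w = train_linear_svm(X, y)

print(classify("claim free cash", w, vocab))
print(classify("team lunch today", w, vocab))
```

Replacing messages with centroids is only one way to even out density. Other options, such as training one SVM per cluster or using clustering to discard outliers, would fit the abstract's description equally well.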
Pages: 12-15
Page count: 4