A Supervised Approach for Spam Detection Using Text-Based Semantic Representation

Cited by: 3
Authors
Saidani, Nadjate [1]
Adi, Kamel [1]
Allili, Mouhand Said [1]
Affiliations
[1] Univ Quebec Outaouais, Dept Comp Sci & Engn, Gatineau, PQ, Canada
Keywords
Email spam detection; Domain categorization; Semantic features; MODELS
DOI
10.1007/978-3-319-59041-7_8
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we propose an approach for email spam detection based on semantic analysis of text at two levels. The first level categorizes emails by specific domain (e.g., health, education, finance). The second level uses semantic features to detect spam within each domain. We show that the proposed method provides an efficient representation of the internal semantic structure of email content, which allows for more precise and interpretable spam filtering than existing methods.
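The two-level structure described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the paper's semantic features or classifiers, so this example swaps in a plain bag-of-words naive Bayes model at both levels as an assumption; the class names, the toy emails, and the domain labels are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one smoothing (a stand-in model)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class TwoLevelSpamFilter:
    """Level 1 routes an email to a domain; level 2 applies that domain's spam model."""

    def fit(self, texts, domains, is_spam):
        self.domain_clf = NaiveBayes().fit(texts, domains)
        self.spam_clfs = {}
        for domain in set(domains):
            idx = [i for i, d in enumerate(domains) if d == domain]
            self.spam_clfs[domain] = NaiveBayes().fit(
                [texts[i] for i in idx], [is_spam[i] for i in idx]
            )
        return self

    def predict(self, text):
        domain = self.domain_clf.predict(text)
        return domain, self.spam_clfs[domain].predict(text)


# Hypothetical toy training set: one spam and one legitimate email per domain.
texts = [
    "cheap pills cure everything buy now",
    "your doctor appointment is confirmed for monday",
    "guaranteed loan approval wire money now",
    "your bank statement for march is available",
]
domains = ["health", "health", "finance", "finance"]
is_spam = [True, False, True, False]

spam_filter = TwoLevelSpamFilter().fit(texts, domains, is_spam)
print(spam_filter.predict("buy cheap pills now"))
print(spam_filter.predict("your bank statement is available"))
```

Returning the predicted domain alongside the spam decision mirrors the interpretability point in the abstract: a reader can see which domain-specific model produced the verdict.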
Pages: 136-148
Number of pages: 13
Related papers
50 in total
  • [1] Text-Based Spam Tweets Detection Using Neural Networks
    Mardi, Vanyashree
    Kini, Anvaya
    Sukanya, V. M.
    Rachana, S.
    [J]. ADVANCES IN COMPUTING AND INTELLIGENT SYSTEMS, ICACM 2019, 2020, : 401 - 408
  • [2] Spam detection proposal in regular and text-based image emails
    Issac, Biju
    Raman, Valliappan
    [J]. TENCON 2006 - 2006 IEEE REGION 10 CONFERENCE, VOLS 1-4, 2006, : 1624 - +
  • [3] Semantic Representation Based on Deep Learning for Spam Detection
    Saidani, Nadjate
    Adi, Kamel
    Allili, Mohand Said
    [J]. FOUNDATIONS AND PRACTICE OF SECURITY, FPS 2019, 2020, 12056 : 72 - 81
  • [4] Spam Comments Detection with Self-Extensible Dictionary and Text-Based Features
    Zhang, Qiang
    Liu, Chenwei
    Zhong, Shangru
    Lei, Kai
    [J]. 2017 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS (ISCC), 2017, : 1225 - 1230
  • [5] LEARNING SEMANTIC-ALIGNED FEATURE REPRESENTATION FOR TEXT-BASED PERSON SEARCH
    Li, Shiping
    Cao, Min
    Zhang, Min
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 2724 - 2728
  • [6] A TEXT-BASED REPRESENTATION FOR PROGRAM VARIANTS
    NARAYANASWAMY, K
    [J]. PROCEEDINGS OF THE 2ND INTERNATIONAL WORKSHOP ON SOFTWARE CONFIGURATION MANAGEMENT, 1989, 17 : 30 - 37
  • [7] A semantic-based classification approach for an enhanced spam detection
    Saidani, Nadjate
    Adi, Kamel
    Allili, Mohand Said
    [J]. COMPUTERS & SECURITY, 2020, 94
  • [8] Hinky: Defending Against Text-based Message Spam on Smartphones
    Lahmadi, Abdelkader
    Delosiere, Laurent
    Festor, Olivier
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2011
  • [9] Text-based Malicious Domain Names Detection Based on Variational Autoencoder And Supervised Learning
    Sun, Yuwei
    Chong, Ng S. T.
    Ochiai, Hideya
    [J]. 2020 54TH ANNUAL CONFERENCE ON INFORMATION SCIENCES AND SYSTEMS (CISS), 2020, : 192 - 196
  • [10] Machine Learning Technique for Fake News Detection Using Text-Based Word Vector Representation
    Gaurav, Akshat
    Gupta, B. B.
    Hsu, Ching-Hsien
    Castiglione, Arcangelo
    Chui, Kwok Tai
    [J]. COMPUTATIONAL DATA AND SOCIAL NETWORKS, CSONET 2021, 2021, 13116 : 340 - 348