Efficient email classification approach based on semantic methods

Cited by: 26
Authors
Bahgat, Eman M. [1]
Rady, Sherine [1]
Gad, Walaa [1]
Moawad, Ibrahim F. [1]
Affiliation
[1] Ain Shams Univ, Fac Comp & Informat Sci, Cairo, Egypt
Keywords
Email classification; Spam; WordNet ontology; Semantic similarity; Features reduction
DOI
10.1016/j.asej.2018.06.001
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Email has become one of the major applications in daily life. The continuous growth in the number of email users has led to a massive increase in unsolicited emails, also known as spam. Managing and classifying this huge volume of email is an important challenge. Most approaches proposed for this problem handle the high dimensionality of emails through syntactic feature selection. This paper presents an efficient email filtering approach based on semantic methods. The proposed approach employs the WordNet ontology and applies different semantic-based methods and similarity measures to reduce the huge number of extracted textual features, thereby reducing space and time complexity. Moreover, to obtain a minimal optimal feature set, feature dimensionality reduction is integrated using feature selection techniques such as Principal Component Analysis (PCA) and Correlation Feature Selection (CFS). Experimental results on the standard benchmark Enron dataset show that the proposed semantic filtering approach combined with feature selection achieves high computational performance at high space and time reduction rates. A comparative study of several classification algorithms indicates that Logistic Regression achieves the highest accuracy compared with Naive Bayes, Support Vector Machine, J48, Random Forest, and radial basis function networks. With the CFS feature selection technique integrated, the average recorded accuracy across all the algorithms used is above 90%, with more than 90% feature reduction. In addition, the experiments show that the proposed work achieves higher accuracy in less time than other related works. (C) 2018 Production and hosting by Elsevier B.V. on behalf of Ain Shams University.
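
The idea of shrinking a textual feature set by merging semantically related words can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the exact semantic methods or similarity measures, so the sketch assumes the simplest possible variant, in which words that map to the same concept are collapsed into one feature. The `TOY_CONCEPTS` table is a hypothetical hand-written stand-in for a WordNet lookup, and the function names (`to_concept`, `merge_by_concept`, `reduction_rate`) are likewise illustrative.

```python
from collections import Counter

# Hypothetical stand-in for a WordNet lookup: each word maps to a concept label.
# A real system would query WordNet synsets instead of this toy table.
TOY_CONCEPTS = {
    "cash": "money", "money": "money", "funds": "money",
    "free": "free", "complimentary": "free",
    "win": "win", "winner": "win",
}


def to_concept(token):
    """Return the concept for a token, or the token itself if unknown."""
    return TOY_CONCEPTS.get(token, token)


def bag_of_words(text):
    """Count whitespace-separated, lower-cased tokens."""
    return Counter(text.lower().split())


def merge_by_concept(counts):
    """Collapse features that share a concept, summing their counts."""
    merged = Counter()
    for token, n in counts.items():
        merged[to_concept(token)] += n
    return merged


def reduction_rate(before, after):
    """Fraction of features removed by the merge."""
    return 1 - len(after) / len(before)


email = "win cash now free funds complimentary money for every winner"
raw = bag_of_words(email)          # one feature per distinct word
reduced = merge_by_concept(raw)    # one feature per distinct concept

print(len(raw), "features before,", len(reduced), "after")
print("reduction rate:", round(reduction_rate(raw, reduced), 2))
```

In this toy example, ten word-level features collapse to six concept-level features. A further feature selection step such as CFS or PCA, as named in the abstract, would then operate on the reduced set.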
Pages: 3259-3269
Number of pages: 11