Spam detection on Twitter using a support vector machine and users' features by identifying their interactions

Cited by: 14
Authors
Ahmad, Saleh Beyt Sheikh [1]
Rafie, Mahnaz [2]
Ghorabie, Seyed Mojtaba [3]
Affiliations
[1] Arvandan Nonprofit Higher Educ Inst, Dept Comp Engn, Khorramshahr, Iran
[2] Islamic Azad Univ, Ramhormoz Branch, Dept Comp Engn, Ramhormoz, Iran
[3] Islamic Azad Univ, Int Branch, Dept Comp Engn, Qeshm, Iran
Keywords
Tweet; Twitter; Spam; Support vector machine
DOI
10.1007/s11042-020-10405-7
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Spam tweets can cause numerous problems for users. This paper proposes an automatic method for detecting spam tweets, based on pre-processing and feature extraction steps. Pre-processing is significant for this problem because of the specific structure of tweets: it is performed so that only the words that can play a key role in determining whether a tweet is spam or non-spam remain in each tweet. The proposed method uses 28 features grouped into five classes: user profile features, account information features, user activity-based features, user interaction-based features, and tweet content-based features. In the feature selection step, an optimal subset of these features is selected for the learning process. A support vector classifier with two kernels, Gaussian and polynomial, is then used for learning. Finally, the proposed method is compared with multi-layer perceptron (MLP), Naive Bayes (NB), random forest (RF), and k-nearest neighbors (KNN) methods in terms of standard criteria. The results show that the proposed method, using the support vector machine (SVM) algorithm with a polynomial kernel, outperforms the other methods overall, with a precision of 0.988, an efficiency of 0.953, an accuracy of 0.96, an F-measure of 0.969, and an area under the ROC curve of 0.985.
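As a rough illustration of the two kernel types named in the abstract and of the kind of standard criteria it reports, the sketch below implements a Gaussian kernel, a polynomial kernel, and a few common binary classification metrics in plain Python. It is not the authors' implementation: the function names, the hyperparameter values (gamma, degree, coef0), and the toy labels are all assumptions made for illustration, since this record gives no hyperparameters or data. The term "efficiency" used in the abstract is not defined in this record, so the sketch does not try to map it to any of the metrics computed here.

```python
import math


def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2). gamma is an assumed value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree. Defaults are assumed values."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


def binary_metrics(y_true, y_pred, positive=1):
    """Precision, recall, accuracy, and F-measure for a binary spam / non-spam task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Hypothetical two-dimensional feature vectors (the paper uses up to 28 features).
    u, v = [1.0, 2.0], [3.0, 4.0]
    print(gaussian_kernel(u, v))
    print(polynomial_kernel(u, v, degree=2))

    # Hypothetical labels: 1 = spam, 0 = non-spam.
    print(binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

The parameter names gamma, degree, and coef0 follow a common convention for SVM kernels; in practice a library classifier would evaluate these kernels internally rather than through hand-written functions like these.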
Pages: 11583-11605
Number of pages: 23
Source
Multimedia Tools and Applications, 2021, 80: 11583-11605