Twitter spam account detection based on clustering and classification methods

Cited by: 0
Authors
Kayode Sakariyah Adewole
Tao Han
Wanqing Wu
Houbing Song
Arun Kumar Sangaiah
Affiliations
[1] University of Ilorin, Faculty of Communication and Information Sciences
[2] Dongguan University of Technology, DGUT
[3] Shenzhen Institutes of Advanced Technology (SIAT), CNAM Institute
[4] Chinese Academy of Sciences (CAS), CAS Key Laboratory of Human
[5] Embry-Riddle Aeronautical University, Machine Intelligence
[6] Vellore Institute of Technology, Synergy Systems
Source
Keywords
Online social network; Spam detection; Fake account; Clustering; Classification
DOI
Not available
Chinese Library Classification (CLC) number
Subject classification code
Abstract
The Twitter social network has grown in popularity as the social activity of its registered users has increased. Twitter performs two functions of an online social network (OSN): it acts as a microblogging OSN and, at the same time, as a news update platform. Recently, the growth in Twitter social interactions has attracted the attention of cybercriminals. Spammers have used Twitter to spread malicious messages, post phishing links, flood the network with fake accounts, and engage in other malicious activities. Detecting the network of spammers who engage in these activities is an important step toward identifying individual spam accounts. Researchers have proposed a number of approaches to identify groups of spammers. However, each of these approaches addresses only a specific category of spammer. This paper proposes a different approach that detects spammers on Twitter based on the similarities that exist among spam accounts. A number of features were introduced to improve the performance of the three classification algorithms selected in this study. The proposed approach applied principal component analysis and a tuned K-means algorithm to cluster over 200,000 accounts, randomly selected from more than 2 million tweets, in order to detect clusters of spammers. Experimental results show that Random Forest achieved the highest accuracy, 96.30%, followed by multilayer perceptron with 96.00% and support vector machine with 95.60%. An evaluation of the selected classifiers under class imbalance also showed that Random Forest achieved the highest accuracy, precision, recall, and F-measure.
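The abstract reports accuracy together with precision, recall, and F-measure under class imbalance. The following minimal Python sketch illustrates why accuracy alone can mislead when spam accounts are a minority. It is not the authors' code: the function name `binary_metrics` and the ten-account label lists are hypothetical and chosen only for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Guard against division by zero when nothing is predicted or labeled positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical imbalanced sample: 2 spam accounts (1) and 8 legitimate ones (0).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A trivial baseline that labels every account as legitimate.
always_legit = [0] * 10

# A hypothetical detector that finds one spammer and raises one false alarm.
detector = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

baseline_scores = binary_metrics(y_true, always_legit)
detector_scores = binary_metrics(y_true, detector)
print(baseline_scores)
print(detector_scores)
```

In this made-up sample both predictors reach the same accuracy, but only the detector has non-zero precision, recall, and F-measure, which is why all four metrics are worth reporting on imbalanced data.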
Pages: 4802 - 4837
Page count: 35
Related papers
50 in total
  • [41] Stochastic Gradient Boosting Model for Twitter Spam Detection
    Devi, K. Kiruthika
    Kumar, G. A. Sathish
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2022, 41 (02): : 849 - 859
  • [42] Machine and Deep Learning Algorithms for Twitter Spam Detection
    Alsaffar, Dalia
    Alfahhad, Amjad
    Alqhtani, Bashaier
    Alamri, Lama
    Alansari, Shahad
    Alqahtani, Nada
    Alboaneen, Dabiah A.
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ADVANCED INTELLIGENT SYSTEMS AND INFORMATICS 2019, 2020, 1058 : 483 - 491
  • [43] An ensemble deep learning model for fast classification of Twitter spam
    Dhar, Suparna
    Bose, Indranil
    INFORMATION & MANAGEMENT, 2024, 61 (08)
  • [44] DON'T FOLLOW ME Spam Detection in Twitter
    Wang, Alex Hai
    SECRYPT 2010: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON SECURITY AND CRYPTOGRAPHY, 2010, : 142 - 151
  • [45] 6 Million Spam Tweets: A Large Ground Truth for Timely Twitter Spam Detection
    Chen, Chao
    Zhang, Jun
    Chen, Xiao
    Xiang, Yang
    Zhou, Wanlei
    2015 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2015, : 7065 - 7070
  • [46] Graph-Based Methods for Clustering Topics of Interest in Twitter
    Hromic, Hugo
    Prangnawarat, Narumol
    Hulpus, Ioana
    Karnstedt, Marcel
    Hayes, Conor
    ENGINEERING THE WEB IN THE BIG DATA ERA, 2015, 9114 : 701 - 704
  • [47] Spam Detection Utilizing Statistical-Based Bayesian Classification
    Zhao, Xianghui
    Zhang, Yangping
    Yi, Junkai
    PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON APPLIED MATHEMATICS, SIMULATION AND MODELLING, 2016, 41 : 327 - 330
  • [48] A semantic-based classification approach for an enhanced spam detection
    Saidani, Nadjate
    Adi, Kamel
    Allili, Mohand Said
    COMPUTERS & SECURITY, 2020, 94
  • [49] Aspect-based classification method for review spam detection
    Cai, Mengsi
    Du, Yonghao
    Tan, Yuejin
    Lu, Xin
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (07) : 20931 - 20952
  • [50] Lazy associative classification for content-based spam detection
    Veloso, Adriano
    Meira, Wagner, Jr.
    LA-WEB 06: FOURTH LATIN AMERICAN WEB CONGRESS, PROCEEDINGS, 2006, : 154 - +