Drifted Twitter Spam Classification Using Multiscale Detection Test on K-L Divergence

Cited by: 20
Authors
Wang, Xuesong [1]
Kang, Qi [1,2]
An, Jing [3]
Zhou, Mengchu [4]
Affiliations
[1] Tongji Univ, Sch Elect & Informat Engn, Dept Control Sci & Engn, Shanghai 201804, Peoples R China
[2] Tongji Univ, Shanghai Inst Intelligent Sci & Technol, Shanghai 201804, Peoples R China
[3] Shanghai Inst Technol, Sch Elect & Elect Engn, Shanghai 201418, Peoples R China
[4] New Jersey Inst Technol, Helen & John C Hartmann Dept Elect & Comp Engn, Newark, NJ 07102 USA
Funding
National Natural Science Foundation of China
Keywords
Concept drift; drift detection test; Twitter spam classification; K-L divergence; ONLINE
DOI
10.1109/ACCESS.2019.2932018
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Twitter spam classification is a tough challenge for social media platforms and cyber security companies. Twitter spam with illegal links may evolve over time in order to deceive filtering models, causing disastrous losses to both users and the whole network. We define this distributional evolution as a concept drift scenario. To build an effective model, we adopt K-L divergence to represent the spam distribution and use a multiscale drift detection test (MDDT) to localize possible drifts therein. A base classifier is then retrained based on the detection result to improve performance. Comprehensive experiments show that K-L divergence exhibits highly consistent change patterns across features when a drift occurs. The MDDT is also shown to be effective in improving the final classification result in terms of accuracy, recall, and F-measure.
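As a rough illustration of the idea in the abstract, the sketch below compares the histogram of one feature in each incoming batch against a reference batch using K-L divergence and flags batches whose divergence exceeds a fixed threshold. It is not the authors' MDDT: the single-scale threshold test, the function names (`histogram`, `kl_divergence`, `detect_drift`), the bin edges, the smoothing constant, and the threshold are all assumptions made for illustration.

```python
import math


def histogram(values, edges, eps=0.5):
    """Return smoothed bin probabilities for `values` over fixed bin `edges`.

    Values outside the edges are clamped into the first or last bin.
    `eps` is an assumed additive smoothing constant that keeps every
    probability strictly positive, so the log ratio is always defined.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = 0
        while idx < len(counts) - 1 and v >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    total = sum(counts) + eps * len(counts)
    return [(c + eps) / total for c in counts]


def kl_divergence(p, q):
    """K-L divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def detect_drift(reference, batches, edges, threshold):
    """Return indices of batches whose divergence from `reference` exceeds `threshold`.

    Hypothetical single-scale test: a flagged batch would be the point
    at which a base classifier is retrained.
    """
    ref = histogram(reference, edges)
    return [
        i
        for i, batch in enumerate(batches)
        if kl_divergence(histogram(batch, edges), ref) > threshold
    ]
```

In practice the threshold would be calibrated on batches known to be drift-free, and a multiscale test would combine checks over windows of different lengths rather than relying on a single batch comparison.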
Pages: 108384-108394
Page count: 11