Drifted Twitter Spam Classification Using Multiscale Detection Test on K-L Divergence

Cited by: 20
Authors
Wang, Xuesong [1]
Kang, Qi [1, 2]
An, Jing [3]
Zhou, Mengchu [4]
Affiliations
[1] Tongji Univ, Sch Elect & Informat Engn, Dept Control Sci & Engn, Shanghai 201804, Peoples R China
[2] Tongji Univ, Shanghai Inst Intelligent Sci & Technol, Shanghai 201804, Peoples R China
[3] Shanghai Inst Technol, Sch Elect & Elect Engn, Shanghai 201418, Peoples R China
[4] New Jersey Inst Technol, Helen & John C Hartmann Dept Elect & Comp Engn, Newark, NJ 07102 USA
Funding
National Natural Science Foundation of China
Keywords
Concept drift; drift detection test; twitter spam classification; K-L divergence; ONLINE;
DOI
10.1109/ACCESS.2019.2932018
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Twitter spam classification is a tough challenge for social media platforms and cyber security companies. Twitter spam with illegal links may evolve over time in order to deceive filtering models, causing disastrous losses to both users and the whole network. We define this distributional evolution as a concept drift scenario. To build an effective model, we adopt K-L divergence to represent the spam distribution and use a multiscale drift detection test (MDDT) to localize possible drifts therein. A base classifier is then retrained based on the detection result to improve performance. Comprehensive experiments show that K-L divergence has highly consistent change patterns between features when a drift occurs. The MDDT is also shown to be effective in improving the final classification results in terms of accuracy, recall, and F-measure.
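The idea of tracking a feature's distribution with K-L divergence can be illustrated with a minimal sketch. This is not the paper's MDDT: it is a single-scale threshold check, and the function names (`histogram`, `kl_divergence`, `detect_drift`), the binning, the smoothing constant, the direction of the divergence (current batch relative to the reference), and the threshold are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def histogram(values: Sequence[float], edges: Sequence[float],
              eps: float = 1e-9) -> List[float]:
    """Smoothed relative frequencies of `values` over the bins given by `edges`.

    Values outside the edge range are clamped into the first or last bin.
    `eps` (an assumed smoothing constant) keeps every bin strictly positive.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = len(counts) - 1          # default: last bin
        for i in range(len(counts)):
            if v < edges[i + 1]:
                idx = i
                break
        counts[idx] += 1
    total = len(values) + eps * len(counts)
    return [(c + eps) / total for c in counts]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Discrete K-L divergence D(p || q) for strictly positive p and q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def detect_drift(reference: Sequence[float],
                 batches: Sequence[Sequence[float]],
                 edges: Sequence[float],
                 threshold: float) -> List[int]:
    """Indices of batches whose divergence from the reference exceeds `threshold`."""
    ref_hist = histogram(reference, edges)
    return [i for i, batch in enumerate(batches)
            if kl_divergence(histogram(batch, edges), ref_hist) > threshold]


# Hypothetical values of one feature: a reference window and two later batches.
reference = [0.1, 0.2, 0.3, 0.2, 0.1, 0.3]
batches = [[0.1, 0.2, 0.3, 0.2], [0.8, 0.9, 0.7, 0.9]]
edges = [0.0, 0.25, 0.5, 0.75, 1.0]
drifted = detect_drift(reference, batches, edges, threshold=0.5)
```

In a pipeline like the one the abstract describes, a flagged batch index would be the point at which the base classifier is retrained on recent data; how the paper combines multiple scales to localize the drift is not reproduced here.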
Pages: 108384-108394
Page count: 11