Drifted Twitter Spam Classification Using Multiscale Detection Test on K-L Divergence

Cited by: 20
Authors
Wang, Xuesong [1]
Kang, Qi [1, 2]
An, Jing [3]
Zhou, Mengchu [4]
Affiliations
[1] Tongji Univ, Sch Elect & Informat Engn, Dept Control Sci & Engn, Shanghai 201804, Peoples R China
[2] Tongji Univ, Shanghai Inst Intelligent Sci & Technol, Shanghai 201804, Peoples R China
[3] Shanghai Inst Technol, Sch Elect & Elect Engn, Shanghai 201418, Peoples R China
[4] New Jersey Inst Technol, Helen & John C Hartmann Dept Elect & Comp Engn, Newark, NJ 07102 USA
Funding
National Natural Science Foundation of China
Keywords
Concept drift; drift detection test; Twitter spam classification; K-L divergence; ONLINE
DOI
10.1109/ACCESS.2019.2932018
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Twitter spam classification is a persistent challenge for social media platforms and cybersecurity companies. Twitter spam with illegal links may evolve over time to deceive filtering models, causing severe losses to both users and the network as a whole. We define this distributional evolution as a concept drift scenario. To build an effective model, we adopt K-L divergence to represent the spam distribution and use a multiscale drift detection test (MDDT) to localize possible drifts therein. A base classifier is then retrained according to the detection result to improve performance. Comprehensive experiments show that the K-L divergence exhibits highly consistent change patterns across features when a drift occurs. The MDDT is also shown to be effective in improving the final classification results in terms of accuracy, recall, and F-measure.
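The idea of tracking a feature's distribution with K-L divergence and flagging a drift when it changes can be illustrated with a minimal sketch. This is not the paper's MDDT: it is a single-scale, fixed-threshold check on one synthetic feature, and the function names (`kl_divergence`, `detect_drift`), the histogram estimate, the add-one smoothing, the window size, and the threshold are all assumptions chosen for illustration.

```python
import numpy as np


def kl_divergence(p_samples, q_samples, bins=10):
    """Estimate D(P || Q) from two 1-D samples using shared histogram bins."""
    lo = min(p_samples.min(), q_samples.min())
    hi = max(p_samples.max(), q_samples.max())
    edges = np.linspace(lo, hi, bins + 1)
    p_counts, _ = np.histogram(p_samples, bins=edges)
    q_counts, _ = np.histogram(q_samples, bins=edges)
    # Add-one smoothing (an assumption) keeps empty bins from producing log(0).
    p = (p_counts + 1) / (p_counts + 1).sum()
    q = (q_counts + 1) / (q_counts + 1).sum()
    return float(np.sum(p * np.log(p / q)))


def detect_drift(stream, window=200, threshold=0.5):
    """Return start indices of windows whose divergence from the
    first (reference) window exceeds the threshold."""
    reference = stream[:window]
    flagged = []
    for start in range(window, len(stream) - window + 1, window):
        current = stream[start:start + window]
        if kl_divergence(reference, current) > threshold:
            flagged.append(start)
    return flagged


# Synthetic feature stream (hypothetical data): the mean shifts at index 600.
rng = np.random.default_rng(0)
stream = np.concatenate([rng.normal(0, 1, 600), rng.normal(3, 1, 400)])
drift_points = detect_drift(stream)
print(drift_points)
```

In a pipeline like the one the abstract describes, a flagged window would trigger retraining of the base classifier on recent data; a multiscale test would additionally repeat the comparison at several window sizes to localize the change point more precisely.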
Pages: 108384-108394
Page count: 11